Lead Azure Data Engineer designing and optimizing data ecosystems on Microsoft Cloud. Responsible for building scalable data platforms and pipelines for analytics and reporting.
Responsibilities
Lead end-to-end development of scalable data pipelines and orchestration frameworks using Azure Data Factory (ADF), Azure Synapse Analytics, Azure Databricks, and Microsoft Fabric.
Build robust real-time and batch data pipelines, including integration with streaming sources (e.g., Event Hubs, Kafka) and structured streaming engines.
Design and implement Structured Streaming applications in Spark for near-real-time processing of streaming data.
Develop and maintain ETL/ELT pipelines and transformations leveraging Spark, PySpark, SQL, and Fabric orchestration capabilities.
Architect and implement data solutions using Microsoft Fabric, including OneLake, Dataflows, warehouses, and Fabric capacity planning to support enterprise analytics.
Collaborate on data governance, cataloging, and asset organization using Unity Catalog within Databricks and Fabric environments.
Manage Microsoft Fabric capacity and resource utilization to optimize performance and cost efficiency for analytics workloads.
Design, deploy, and optimize Databricks dashboards and reporting artifacts for business stakeholders.
Apply best practices for data modelling, caching, file sizing, and performance tuning of Spark and Delta Lake jobs (e.g., Z-ORDER, broadcast joins, adaptive query execution).
Oversee governance, access controls, metadata management, and lineage using Unity Catalog.
Lead and mentor a team of data engineers, fostering best practices in development, operations, documentation, and quality.
Work with cross-functional teams (architecture, BI, data science, DevOps) to translate business requirements into scalable data solutions.
Partner with stakeholders to define data strategy, standards, and architectural roadmaps.
Establish and enforce standards for data quality, testing, monitoring, operational observability, and governance.
Implement secure, compliant data access and lineage frameworks across cloud data platforms.
Implement CI/CD pipelines, infrastructure-as-code for data platform artifacts, and automated testing frameworks for data jobs and workflows.
Requirements
10+ years of hands-on experience in data engineering on Azure with deep expertise in ADF, Synapse, Databricks, and Microsoft Fabric.
Proven experience with real-time data processing, streaming architectures, and Spark Structured Streaming.
Strong proficiency in Azure Data Factory, Spark (PySpark), SQL, Azure Synapse Analytics, Databricks Runtime, and cloud storage.
Solid knowledge of Unity Catalog for data governance, security, and access management.
Experience designing and managing Databricks dashboards, along with performance optimization, cost controls, and data platform resource tuning.
Expertise in building scalable, fault-tolerant, and high-throughput batch and streaming data solutions.
Excellent leadership, cross-team collaboration, and communication skills.
Data Engineer managing payment processing and data accuracy while collaborating with financial teams. Building and optimizing data pipelines for transactional data in a hybrid work environment.
Data Engineer building analytical tools for Dry Bulk market data operations at Kpler. Join a team of over 700 experts transforming data into actionable strategies.
Data Engineer developing tools for maintaining data integrity in cargo tracking at Kpler. Collaborating with analysts and engineers to enhance data quality management.
Data Engineer providing support for IBM DataStage ETL jobs at Callibrity. Collaborating with stakeholders and working to modernize technology solutions in a hybrid work environment.
Cloud Data Engineer implementing tailored solutions for Volkswagen Group data processing. Building ETL/ELT pipelines while collaborating with technical experts.
Data Engineer responsible for building scalable data infrastructure that supports data-driven decisions. Collaborating with the team to maintain systems and unlock data value for organizations.
Data Engineer designing and optimizing data pipelines using Databricks and Google Cloud Platform. Collaborating with analysts and scientists to deliver high-quality data products.
Associate Data Engineer supporting privacy engineering controls and executing privacy impact assessments in a financial services company. Collaborating across business units to ensure alignment with privacy regulations.
Data Engineer at CVS Health optimizing data pipelines and analytical models. Driving data-driven decisions with healthcare data for improved business outcomes.