Responsibilities
Design, build, and optimize scalable ETL and Structured Streaming pipelines in Azure Databricks for real-time and batch ingestion of Flight Status data
Design and implement data ingestion and processing pipelines that consolidate heterogeneous data sources, including APIs, event streams, and file-based feeds, into the OAG lakehouse (Azure Databricks + Delta Lake), ensuring data consistency, reliability, and scalability
Implement and monitor data quality using automated validation, alerting, and observability practices
Develop and maintain orchestration workflows in Apache Airflow, coordinating ingestion and transformation processes across multiple data flows
Build reusable frameworks for schema evolution, error handling, deduplication, and auditing
Collaborate with data platform, analytics, and product teams to define SLAs, data contracts, and performance targets
Optimize Spark and Delta Lake performance for scalability, latency, and cost efficiency
Implement CI/CD pipelines and automation for data workflows using Azure DevOps or equivalent tools
Mentor engineers, review code, contribute to platform design discussions and planning, and help grow data engineering competencies in the team and across OAG
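The deduplication pattern referenced in the responsibilities above can be sketched in plain Python. This is a minimal, framework-agnostic illustration of the bounded-state idea behind Structured Streaming's `withWatermark` + `dropDuplicates`; the `FlightEvent` fields and the `deduplicate` helper are hypothetical names for illustration only, not part of any OAG codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class FlightEvent:
    flight_id: str        # hypothetical business key
    event_time: datetime  # event-time column used for watermarking
    status: str

def deduplicate(events, watermark=timedelta(hours=1)):
    """Drop events whose (flight_id, event_time) key was already seen
    within the watermark window. Mirrors the bounded-state guarantee of
    dropDuplicates after withWatermark: keys older than the watermark
    are evicted, so state does not grow without bound."""
    seen = set()
    kept = []
    max_event_time = None
    for ev in sorted(events, key=lambda e: e.event_time):
        max_event_time = (ev.event_time if max_event_time is None
                          else max(max_event_time, ev.event_time))
        # Evict dedup state older than the watermark cutoff
        cutoff = max_event_time - watermark
        seen = {k for k in seen if k[1] >= cutoff}
        key = (ev.flight_id, ev.event_time)
        if key not in seen:
            seen.add(key)
            kept.append(ev)
    return kept
```

In a real pipeline the same idea is expressed declaratively, e.g. `df.withWatermark("event_time", "1 hour").dropDuplicates(["flight_id", "event_time"])`, with Spark managing the eviction of state across micro-batches.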
Requirements
Proven track record in data engineering with a strong focus on ETL development and streaming data architectures
Experience with Azure Databricks, Apache Spark (Structured Streaming), and Delta Lake
Proficiency in Python (PySpark) and SQL, with experience transforming large-scale, complex datasets
Hands-on experience in data orchestration and workflow automation (e.g., Apache Airflow or similar)
Experience working in a cloud data environment (preferably Azure) across storage, compute, and pipeline services
Familiarity with streaming or messaging technologies (e.g., Kafka, Event Hubs)
Strong understanding of data quality, validation, and observability practices
Ability to deliver production-grade solutions with a results-oriented and ownership-driven mindset
Experience implementing CI/CD and version-control practices using Azure DevOps, GitHub Actions, or similar tools
Excellent analytical, communication, and collaboration skills
Strong understanding of modern data engineering patterns and ability to design scalable, modular, and reliable data systems
Benefits
Company-provided free lunch every day
Private health insurance
Company bonus scheme
Voluntary participation in a company-supported retirement scheme
Generous annual leave policy that grows with each year of service