Senior Data Engineer designing scalable ETL data pipelines using Databricks for a software consulting company. Collaborating with teams to implement robust data solutions in diverse business environments.
Responsibilities
Design and implementation of robust, scalable, and high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform;
Implementation and optimization of the Medallion Architecture (Bronze, Silver, Gold) using Delta Lake, ensuring data quality, consistency, and historical tracking;
Efficient implementation of the Lakehouse architecture on Databricks, combining best practices from traditional Data Warehousing and Data Lake paradigms;
Optimization of Databricks clusters, Spark operations, and Delta tables (e.g. Z-Ordering, compaction, query tuning) to reduce latency and compute costs;
Design and implementation of real-time and near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables (DLT);
Implementation and administration of Unity Catalog for centralized data governance, fine-grained security (row- and column-level security), and end-to-end data lineage;
Definition and implementation of data quality standards and validation rules (e.g. using DLT or Great Expectations) to ensure data integrity and reliability;
Development and management of complex workflows using Databricks Workflows (Jobs) or external orchestration tools such as Azure Data Factory or Airflow to automate data pipelines;
Integration of Databricks pipelines into CI/CD processes using Git, Databricks Repos, and Databricks Asset Bundles;
Close collaboration with Data Scientists, Analysts, and Architects to translate business requirements into optimal technical solutions;
Providing technical mentorship to junior engineers and promoting engineering best practices across the team.
Requirements
Proven, expert-level experience across the full Databricks ecosystem, including Workspace management, cluster configuration, notebooks, and Databricks SQL;
In-depth knowledge of Spark architecture (RDDs, DataFrames, Spark SQL) and advanced performance optimization techniques;
Strong expertise in implementing and managing Delta Lake features, including ACID transactions, Time Travel, MERGE operations, OPTIMIZE, and VACUUM;
Advanced/expert proficiency in Python (PySpark) and/or Scala (Spark);
Expert-level SQL skills and strong experience with data modeling approaches (Dimensional Modeling, 3NF, Data Vault);
Solid hands-on experience with a major cloud platform (AWS, Azure, or GCP), with a strong focus on cloud storage services (S3, ADLS Gen2, GCS) and networking fundamentals.
Nice to have
Practical experience implementing and administering Unity Catalog for centralized governance and fine-grained access control;
Hands-on experience with Delta Live Tables (DLT) and Databricks Workflows for building and orchestrating data pipelines;
Basic understanding of MLOps concepts and hands-on experience with MLflow to support collaboration with Data Science teams;
Experience with Terraform or equivalent Infrastructure as Code (IaC) tools;
Databricks certifications (e.g. Databricks Certified Data Engineer Professional) are considered a significant advantage;
Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related technical field;
5+ years of experience in Data Engineering, including at least 3 years working with Databricks and Apache Spark at scale.
Benefits
Premium medical package
Lunch Tickets & Pluxee Card
Bookster subscription
13th salary and yearly bonuses
Enterprise job security with a startup mentality (diverse & engaging environment, international exposure, flat hierarchy), backed by the stability of a multinational
A supportive culture (we value ownership, autonomy, and healthy work-life balance) with great colleagues, team events and activities
Flexible working program and openness to remote work
Collaborative mindset – employees shape their own benefits, tools, team events and internal practices
Diverse opportunities in Software Development with international exposure
Flexibility to choose projects aligned with your career path and technical goals
Access to leading learning platforms, courses, and certifications (Pluralsight, Udemy, Microsoft, Google Cloud)
Career growth & learning – mentorship programs, certifications, professional development opportunities, and above-market salary