Data Engineer with expertise in Databricks, SQL, and Python for scalable data solutions. Focused on ETL/ELT pipeline development, auditability, data quality, and automated testing.
Responsibilities
Lead the design, development, and optimization of ETL/ELT pipelines using Databricks, Python, Spark, and Delta Lake.
Architect scalable data solutions using Medallion architecture (Bronze, Silver, Gold layers).
Design and implement data models and transformations using SQL and Python.
Build and maintain audit frameworks to ensure traceability, compliance, and data lineage.
Develop data quality monitoring and automated testing frameworks for pipeline reliability.
Perform data analysis to support operational data requests and user queries.
Collaborate with clinical data teams to analyse IRT/RTSM datasets.
Create and maintain dashboards and reports using BI tools (e.g., Superset, Power BI, Tableau, Qlik, or similar).
Help manage CI/CD and automated code branching/deployment.
Ensure compliance with GxP, CDISC, and other regulatory standards.
Mentor junior engineers and promote engineering best practices.
Requirements
8–10 years of experience in data engineering, with leadership or team lead responsibilities.
Strong hands-on experience with Databricks, Apache Spark, and Delta Lake.
Advanced proficiency in SQL and Python for data transformation and automation.
Experience with ETL/ELT orchestration, job optimization, and performance tuning.
Proven experience designing and implementing audit, data quality, and testing frameworks.
Hands-on experience with IRT/RTSM clinical trial data systems.
Strong data analysis skills and ability to interpret complex datasets.
Experience with BI/reporting tools such as Power BI, Tableau, or Qlik.
Knowledge of clinical data standards (e.g., CDISC, SDTM, ADaM).
Experience with cloud platforms (Azure, AWS, or GCP) and CI/CD pipelines.
Data Engineer creating and managing data pipelines for critical data solutions at S&P Global. Collaborating on enterprise-scale data processing in a supportive, innovative environment.
Data Engineer supporting and evolving the data environment during a cloud migration. Maintaining and optimizing existing databases while designing modern data solutions through cross-functional collaboration.
Senior Data Engineer responsible for data pipeline projects at Suprema Gaming. Focus on batch and streaming data solutions while collaborating with business teams.
Senior data leader managing the enterprise data architecture at Breakthru Beverage. Leading high-performing teams in data engineering and defining modern data strategies.
Data Engineer at Equinix implementing data architecture solutions for scalability and analytics. Collaborating with teams to design data pipelines and maintain data models for business objectives.
Data Warehouse Architect developing and optimizing robust data warehouse environments on SAP BW/4HANA. Critical for enabling advanced analytics and reporting across the organization.
Data Engineering Manager leading a new Data Engineering team in Bengaluru. Shaping the design and scaling of core data engineering practices across the organization.
Sr. ETL/Data Warehouse Lead at Huntington designing, developing, and supporting the ETL and Data Warehousing framework. Analyzing systems based on specifications and providing technical assistance.
Senior Google Data Architect designing and delivering scalable data solutions on Google Cloud Platform. Collaborating across teams to shape target-state data architectures and influence enterprise data strategy.
Data Engineer developing scalable data lake solutions and optimizing data pipelines at U.S. Bank. Collaborating with teams to manage data governance and cloud migration activities.