Data Engineer crafting data ingestion and transformation processes for an AI healthcare platform. Collaborating with teams to turn complex healthcare data into actionable insights.
Responsibilities
Own data ingestion, transformation, and curation across Bronze, Silver, and Gold layers of our Databricks-based Medallion Architecture.
Manage and optimize data pipelines using Airflow for orchestration and Airbyte (or similar tools) for multi-source ingestion.
Build and maintain connectors and workflows for APIs, EHR/EMR systems (FHIR), resident life systems, and IoT/monitoring data sources.
Implement batch and streaming pipelines supporting both analytics and near real-time use cases.
Develop and monitor data quality, validation, and profiling frameworks across ingestion points.
Support AI enablement efforts — preparing data for LLM-based insights, population health analytics, and predictive modeling use cases (e.g., fall risk, medication adherence, staffing optimization).
Collaborate closely with data science to enable curated datasets and semantic layers for Superset and AI query interfaces.
Partner with our infrastructure team to maintain infrastructure as code (Terraform) for data services, ensuring scalability and reproducibility.
Partner with security and compliance officers to move towards HIPAA and SOC 2 alignment for all data storage and processing.
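The Bronze-to-Silver step of the Medallion flow above can be sketched in outline with plain Python; the record shape, field names, and deduplication key here are hypothetical, and a real pipeline would run this as a Spark/Delta Lake transform with quarantining rather than silent drops:

```python
from datetime import datetime, timezone

# Hypothetical minimal schema for an event record landing in Bronze.
REQUIRED_FIELDS = {"patient_id", "event_type", "recorded_at"}

def promote_to_silver(bronze_records):
    """Cleanse raw Bronze records into a validated, deduplicated Silver set.

    Drops records missing required fields, normalizes timestamps to UTC
    ISO-8601, and deduplicates on (patient_id, event_type, timestamp).
    """
    seen = set()
    silver = []
    for rec in bronze_records:
        if not REQUIRED_FIELDS <= rec.keys():
            continue  # a production pipeline would route these to quarantine
        ts = datetime.fromisoformat(rec["recorded_at"])
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assume UTC for naive stamps
        key = (rec["patient_id"], rec["event_type"], ts)
        if key in seen:
            continue  # duplicate after timestamp normalization
        seen.add(key)
        silver.append({**rec, "recorded_at": ts.astimezone(timezone.utc).isoformat()})
    return silver
```

The same shape generalizes: Gold-layer curation would then aggregate Silver records into analytics-ready tables.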
Requirements
4–8 years of hands-on data engineering experience.
Strong proficiency with Airflow and Databricks (Spark, Delta Lake, SQL, Python).
Experience building scalable ingestion pipelines with Airbyte, Fivetran, or custom API connectors.
Solid understanding of Azure data ecosystem (Data Lake, Blob Storage, Key Vault, Functions, FHIR Server, etc.).
Experience implementing and maintaining ETL/ELT pipelines in a HIPAA or regulated environment.
Comfort with both SQL and Python for transformations, orchestration, and testing.
Strong grasp of data modeling, schema evolution, and versioned datasets.
Ability to operate independently and deliver results in a small, fast-moving team.
Experience with FHIR and healthcare data structures and interoperability standards.
Familiarity with vector databases (e.g., pgvector, Pinecone) or embedding pipelines for AI/LLM applications.
Experience with GitHub best practices for maintaining and sharing code.
Familiarity with Superset or other analytics tools for internal visualization.
Understanding of security best practices, including encryption, RBAC, and least-privilege design.
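As a taste of the FHIR work the requirements above describe, flattening a FHIR R4 Patient resource into a summary row can be sketched with stdlib JSON handling; the sample field choices are illustrative, not a prescribed mapping:

```python
import json

def patient_summary(resource_json: str) -> dict:
    """Pull a flat summary out of a FHIR R4 Patient resource.

    Uses the first 'official' name (falling back to the first name entry)
    and the first identifier value; returns an empty dict for non-Patient
    resources.
    """
    res = json.loads(resource_json)
    if res.get("resourceType") != "Patient":
        return {}
    names = res.get("name", [])
    official = next(
        (n for n in names if n.get("use") == "official"),
        names[0] if names else {},
    )
    identifiers = res.get("identifier", [])
    return {
        "id": identifiers[0]["value"] if identifiers else None,
        "family": official.get("family"),
        "given": " ".join(official.get("given", [])),
        "birth_date": res.get("birthDate"),
    }
```

In practice this kind of flattening would feed Silver-layer tables or embedding pipelines rather than run standalone.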
Benefits
Christmas Bonus: 30 days, to be paid in December.
Major Medical Expense Insurance: Coverage up to $20,000,000.00 MXN.
Minor Medical Insurance: VRIM membership with special discounts on doctor’s appointments and accident reimbursements.
Dental Insurance: Always smile with confidence!
Life Insurance: death and disability coverage (MXN).
Vacation Days: 12 vacation days in accordance with Federal Labor Law, with prior approval from your manager.
Floating Holidays: 3 floating holidays in addition to the 7 official holidays in Mexico.
Cloud Data Engineer implementing tailored solutions for Volkswagen Group data processing. Building ETL/ELT pipelines while collaborating with technical experts.
Data Engineer designing and optimizing data pipelines using Databricks and Google Cloud Platform. Collaborating with analysts and scientists to deliver high-quality data products.
Data Engineer responsible for building scalable data infrastructure that supports data-driven decisions. Collaborating with the team to maintain systems and unlock data value for the organization.
Associate Data Engineer supporting privacy engineering controls and executing privacy impact assessments in a financial services company. Collaborating across business units to ensure alignment with privacy regulations.
Data Engineer at CVS Health optimizing data pipelines and analytical models. Driving data-driven decisions with healthcare data for improved business outcomes.
Senior Data Engineer at CVS Health developing robust data pipelines for healthcare data. Collaborating with teams to provide actionable insights and integrate them with consumer touchpoints.
Senior Data Engineer supporting an AI-enabled financial compliance initiative with data pipelines and ingestion processes. Collaborating with diverse teams in a mission-critical regulated environment.
Data Architect leading the definition and construction of cloud data architecture for Kyndryl. Participating in significant technological modernization initiatives, focusing on Google Cloud Platform.
Senior Data Engineer driving data intelligence requirements and scalable data solutions for a global consulting firm. Collaborating across functions to enhance Microsoft architecture and analytics capabilities.
Experienced AI Engineer designing and building production-grade agentic AI systems using generative AI and large language models. Collaborating with data engineers and data scientists in a tech-driven company.