Data Engineer crafting data ingestion and transformation processes for an AI healthcare platform. Collaborating with teams to turn complex healthcare data into actionable insights.
Responsibilities
Own data ingestion, transformation, and curation across Bronze, Silver, and Gold layers of our Databricks-based Medallion Architecture.
Manage and optimize data pipelines using Airflow for orchestration and Airbyte (or similar tools) for multi-source ingestion.
Build and maintain connectors and workflows for APIs, EHR/EMR systems (FHIR), resident-life applications, and IoT/monitoring data sources.
Implement batch and streaming pipelines supporting both analytics and near real-time use cases.
Develop and monitor data quality, validation, and profiling frameworks across ingestion points.
Support AI enablement efforts — preparing data for LLM-based insights, population health analytics, and predictive modeling use cases (e.g., fall risk, medication adherence, staffing optimization).
Collaborate closely with the data science team to deliver curated datasets and semantic layers for Superset and AI query interfaces.
Partner with our infrastructure team to maintain infrastructure as code (Terraform) for data services, ensuring scalability and reproducibility.
Partner with security and compliance officers to move towards HIPAA and SOC 2 alignment for all data storage and processing.
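The medallion flow and data-quality gating described above can be sketched in plain Python. Record shapes (resident_id, heart_rate) are hypothetical examples; a production pipeline would use Spark DataFrames and Delta tables on Databricks, orchestrated by Airflow.

```python
# Minimal sketch of a Bronze -> Silver -> Gold medallion flow over plain
# Python dicts. Field names are illustrative, not from a real schema.

def to_silver(bronze_records):
    """Validate and clean raw Bronze records into the Silver layer."""
    silver = []
    for rec in bronze_records:
        # Data-quality gate: drop records missing required fields or
        # carrying physiologically impossible values.
        if rec.get("resident_id") is None:
            continue
        hr = rec.get("heart_rate")
        if hr is None or not (20 <= hr <= 250):
            continue
        silver.append({"resident_id": rec["resident_id"], "heart_rate": hr})
    return silver

def to_gold(silver_records):
    """Aggregate Silver records into a Gold, analytics-ready summary."""
    by_resident = {}
    for rec in silver_records:
        by_resident.setdefault(rec["resident_id"], []).append(rec["heart_rate"])
    # Gold layer: mean heart rate per resident.
    return {rid: sum(vals) / len(vals) for rid, vals in by_resident.items()}

bronze = [
    {"resident_id": "r1", "heart_rate": 72},
    {"resident_id": "r1", "heart_rate": 80},
    {"resident_id": None, "heart_rate": 75},   # dropped: missing id
    {"resident_id": "r2", "heart_rate": 999},  # dropped: out of range
]
gold = to_gold(to_silver(bronze))  # {"r1": 76.0}
```

The same layering applies whether the source is a batch extract or a streaming micro-batch: Bronze keeps raw records as-landed, Silver enforces the quality contract, Gold serves analytics.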
Requirements
4–8 years of hands-on data engineering experience.
Strong proficiency with Airflow and Databricks (Spark, Delta Lake, SQL, Python).
Experience building scalable ingestion pipelines with Airbyte, Fivetran, or custom API connectors.
Solid understanding of Azure data ecosystem (Data Lake, Blob Storage, Key Vault, Functions, FHIR Server, etc.).
Experience implementing and maintaining ETL/ELT pipelines in a HIPAA or regulated environment.
Comfort with both SQL and Python for transformations, orchestration, and testing.
Strong grasp of data modeling, schema evolution, and versioned datasets.
Ability to operate independently and deliver results in a small, fast-moving team.
Experience with FHIR and healthcare data structures and interoperability standards.
Familiarity with vector databases (e.g., pgvector, Pinecone) or embedding pipelines for AI/LLM applications.
Experience with GitHub best practices for maintaining and sharing code.
Familiarity with Superset or other analytics tools for internal visualization.
Understanding of security best practices, including encryption, RBAC, and least-privilege design.
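As a concrete example of the FHIR familiarity expected above, the sketch below flattens a FHIR R4 Patient resource into an analytics-friendly row using only the standard library. The inline sample resource is illustrative; real ingestion would pull resources from a FHIR server such as the Azure FHIR Server.

```python
import json

# Illustrative FHIR R4 Patient resource (trimmed to a few elements).
patient_json = """
{
  "resourceType": "Patient",
  "id": "example",
  "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
  "birthDate": "1974-12-25"
}
"""

def flatten_patient(resource: dict) -> dict:
    """Flatten a FHIR Patient resource into a tabular row."""
    assert resource["resourceType"] == "Patient"
    # Patient.name is a list of HumanName elements; take the first.
    name = resource.get("name", [{}])[0]
    return {
        "patient_id": resource.get("id"),
        "family_name": name.get("family"),
        "given_names": " ".join(name.get("given", [])),
        "birth_date": resource.get("birthDate"),
    }

row = flatten_patient(json.loads(patient_json))
```

Flattening like this is typically a Bronze-to-Silver step, since raw FHIR JSON is deeply nested and unsuited to direct SQL analytics.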
Benefits
Christmas Bonus (Aguinaldo): 30 days' salary, paid in December.
Major Medical Expense Insurance: Coverage up to $20,000,000.00 MXN.
Minor Medical Insurance: VRIM membership with special discounts on doctor’s appointments and accident reimbursements.
Dental Insurance: Always smile with confidence!
Life Insurance: coverage for death and disability (MXN).
Vacation Days: 12 vacation days in accordance with Federal Labor Law, with prior approval from your manager.
Floating Holidays: 3 floating holidays in addition to the 7 official holidays in Mexico.