Data Engineer responsible for data infrastructure and pipelines to support drug discovery efforts. Collaborating with scientists and engineers to facilitate data-driven insights in an innovative biotech startup.
Responsibilities
Design and implement data pipelines that harmonize, validate, and version scientific data for downstream use in modeling and analysis
Develop tools and schemas for integrating heterogeneous data types (chemical, image-based, genomic, etc.)
Build and maintain scalable data storage systems and APIs to make experimental and model-derived data accessible to scientists and machine learning teams
Collaborate with ML Scientists to prepare and curate datasets for training and evaluating predictive models
Partner with Software Engineers to surface clean, well-structured data to end users through our internal and customer-facing platforms
Establish and enforce best practices for data governance, reproducibility, and lineage tracking
Requirements
4+ years of experience as a Data Engineer, ML Platform Engineer, or similar role
Proficiency building and maintaining data pipelines and ETL processes in Python (e.g. using orchestration tools such as Dagster, Airflow, or Prefect)
Experience with cloud-based storage and compute (AWS S3, ECS, etc., or equivalent)
Outstanding written and oral communication skills
Interest in diving deep into the science of drug discovery and the business of a growing startup
Nice to have: Experience managing and working with scientific data, particularly in chemistry
Benefits
Competitive salary and equity-based compensation
Comprehensive healthcare benefits (including dental and vision)
Opportunity to grow along with a rapidly scaling company
Senior Data Engineer at Transamerica supporting the creation of the Book of Record by building AWS-based data pipelines. Collaborating in an agile environment to enable a single source of truth.
Junior Data Engineer role at Allegro focusing on geospatial data transformation and implementing automation workflows using SQL/Python. Opportunity to work with leading data teams and technologies.
Data Engineer II at Nium focused on building scalable data solutions for global payments. Collaborate with teams on cloud migration and high-impact data projects in a hybrid environment.
Senior GCP Data Engineer leading the design and optimization of scalable data platforms on Google Cloud. Collaborating with cross-functional teams for analytics and business-critical applications.
Data Engineer Lead developing and maintaining data applications and pipelines in Azure for State Street. Collaborating with teams to ensure data integrity and implementing data processing principles.
Data Architect designing and implementing data architectures for pharmaceutical clients. Collaborating with teams to create efficient data ecosystems that drive business value.
Working student in Automation & Data Engineering at Kelvion supporting internal tools and automations using Microsoft Power Platform. Involvement in systems configuration, reports, dashboards, and data structure maintenance.
Senior AWS Data Engineer supporting HHS grant award management, providing technical leadership and maintaining cloud data systems. Ideal for candidates excelling in modernization and operational environments.
Data Engineer at Booz Allen developing technology solutions for clients analyzing large datasets. Collaborating with teams on mission-driven projects using data engineering best practices.
Data Engineer supporting data pipelines for business insights and GenAI applications at Assurity Trusted Solutions. Collaborating with teams on data workflows in a hybrid environment.