Data Engineer designing and building core data systems that power research at EIT. Collaborating with teams across healthcare, robotics, agriculture, and AI.
Responsibilities
**The Role:**
Our Data Engineering Team builds the core data systems that power frontier research across EIT. As an early member of our Data Engineering team, you’ll design and build the platforms used by scientists and engineers in fields such as healthcare, robotics, agriculture, and AI. You’ll work alongside our MLOps and Infrastructure teams to create reliable, scalable systems capable of handling large-scale (from TB to PB+), multimodal datasets.
EIT is unique in combining foundational data from diverse disciplines into a single research ecosystem. You’ll help develop the technical foundation that makes this possible: platforms, services, APIs and distributed systems that are robust, observable and easy to work with. This is a role for engineers who think long-term and want to build a platform that will underpin the next generation of scientific and technological discovery.
**Day-to-Day, You Might:**
Design and build distributed data systems that support research across EIT’s scientific domains.
Architect APIs and services for high-throughput, low-latency access to multimodal datasets.
Work with MLOps, Infrastructure and data engineers embedded within research teams to integrate systems into active research workflows.
Develop pipelines for large-scale text, audio, video, imaging, sensor, and structured data on OCI.
Add observability, monitoring, and automated quality checks to ensure the trustworthiness of every dataset.
Contribute to an engineering culture that values maintainability, testing, clear system design, and deep collaboration with our researchers and scientists.
Requirements
**What Makes You a Great Fit:**
You have strong programming experience in Python and SQL, and value code quality, reliability (including testing, CI/CD) and observability as much as performance.
You have experience designing, deploying, and optimising distributed data systems or data-intensive backend services.
You think in terms of systems and longevity, not just one-off ETL scripts, and embrace end-to-end ownership from low-level performance to user interfaces.
You’re a collaborative partner to Infrastructure/Ops teams and researchers, and a clear, respectful communicator.
You have a low-ego, team-first mindset and help grow our engineering culture by mentoring, sharing, and elevating the work of those around you.
**Great to Also Have:**
**Nobody checks every box. If you’re not sure whether you’re qualified, we still encourage you to apply.**
You’re used to working with modern tech stacks and developing for distributed systems, for example Spark/Flink/Kafka, Polars/Arrow, Airflow/Prefect.
You’ve contributed to shared Python libraries used across multiple teams and maintained dependency and packaging standards (e.g. Poetry, pip-tools).
You have experience integrating multimodal datasets (text, video, imaging, sensor data) into unified platforms.
You’ve designed and optimised robust, high-performance APIs for data ingestion/consumption using tools such as FastAPI, gRPC, and GraphQL, and use tools such as Prometheus and OpenTelemetry to maintain SLAs.
You’re curious about database internals, storage engines, and low-latency query processing.
You’ve built web apps and dashboards using tools such as Dash or frameworks like React.
You’ve managed schema evolution, data versioning, and governance in production with tools such as Open Policy Agent and Apache Hive Metastore.
Benefits
**We offer the following salary and benefits:**
Enhanced holiday pay
Pension
Life Assurance
Income Protection
Private Medical Insurance
Hospital Cash Plan
Therapy Services
Perk Box
Electric Car Scheme
**Why work for EIT:**
At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!