Senior ML Infrastructure Engineer at Ellison Institute of Technology, Oxford | Hybrid

About the role

Join Ellison Institute of Technology as a Senior ML Infrastructure Engineer. Build and operate high-performance ML infrastructure to enable scientific breakthroughs in Oxford.

Responsibilities

**Day-to-day, you might:**
Build, operate, and continuously optimise our high-performance GPU training and inference clusters, focusing on robust, high-availability scheduling, isolation, and automated lifecycle management.
Drive systems design and implementation for high-throughput data paths, optimising I/O, caching, and data locality across compute and storage (including our current Lustre implementation).
Proactively benchmark, profile, and resolve performance bottlenecks across the compute, network, and orchestration layers to maximise efficiency for distributed training and inference.
Establish comprehensive observability, resilience, and automated security controls to ensure compliance and robust operation of sensitive research environments.
Partner with Research, Data, and Applied teams to forecast capacity and cost for GPU and storage needs, setting quotas and streamlining ML experimentation pipelines.

Requirements

**What makes you a great fit:**
Proven experience leading the design, build, and operation of high-performance ML compute clusters at scale
A proactive, autonomous approach to systems design and the proven ability and desire to ideate, co-create and implement optimal solutions
Exposure to migrating or transforming ML infrastructure from traditional schedulers to modern, containerised systems
Expertise with high-throughput storage systems for ML/HPC workloads
Expert-level understanding of GPU architecture, high-speed networking for distributed training, and performance profiling to resolve bottlenecks
A solid grasp of IaC and CI/CD practices (e.g., Terraform, Argo CD)
**It would also be great if you had:**
Experience with Lustre

Benefits

**We offer the following salary and benefits:**
Enhanced holiday pay
Pension
Life Assurance
Income Protection
Private Medical Insurance
Hospital Cash Plan
Therapy Services
Perkbox
Electric Car Scheme
**Why work for EIT:**
At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. We value emotional intelligence, empathy, respect, and resilience, and encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!

Similar roles

5 hours ago

Media Infrastructure Engineer, Data Center

Consort Group

Infrastructure Engineer modernizing Data Center environments for media content distribution. Involved in technical architecture design and performance optimization for audiovisual workflows.

Hybrid Role

Paris France Infrastructure Engineer

5 hours ago

Senior Infrastructure Engineer

Oritain

Senior Infrastructure Engineer responsible for Azure platform architecture and CI/CD pipelines at Oritain. Collaborating with teams to automate and secure infrastructure while enabling fast engineering.

Hybrid Role

London United Kingdom Infrastructure Engineer

£80,000 - £90,000 per year

6 hours ago

IT Infrastructure Engineer

Sumerge

IT Infrastructure Engineer at Sumerge delivering second-level IT support and troubleshooting assistance. Responsible for network infrastructure maintenance and collaboration with server owners to ensure reliability.

Hybrid Role

Cairo Egypt Infrastructure Engineer

7 hours ago

Senior Cloud Infrastructure Engineer

Langfuse

Cloud Infrastructure Engineer responsible for Langfuse Cloud operations and observability at scale. Managing AWS and ClickHouse deployment to ensure performance and cost optimization.

Hybrid Role

Berlin Germany Infrastructure Engineer

€90,000 - €160,000 per year

yesterday

Site Infrastructure Engineer

SABIC

Site Infrastructure Engineer managing HVAC and utility systems at SABIC. Overseeing maintenance, project activities, and long-term asset strategies for operational efficiency.

Onsite Role

United States Infrastructure Engineer

3 days ago

Infrastructure Engineer – WAF

Lloyds Banking Group

Key engineer developing and operating Web Application Firewall (WAF) platforms at Lloyds Banking Group. Enhancing security and performance while working with modern engineering practices.

Hybrid Role

Leeds United Kingdom Infrastructure Engineer

£48,987 - £55,430 per year

5 days ago

Infrastructure Engineering Lead – Edge Security

Lloyds Banking Group

Lead Infrastructure Engineer driving Edge Security capabilities for Lloyds Banking Group. Focusing on web access protection, Zero Trust architectures, and modern security engineering approaches.

Hybrid Role

Leeds United Kingdom Infrastructure Engineer

£92,701 - £109,060 per year

6 days ago

Senior System Administrator – Infrastructure Engineer

IMAGO

Senior System Administrator & Infrastructure Engineer managing reliable infrastructure and driving DevOps practices at IMAGO. Collaborating with development teams and providing technical guidance to ensure best practices.

Hybrid Role

Berlin Germany Infrastructure Engineer

6 days ago

Infrastructure Engineer – Foundation

Pylon

Infrastructure Engineer maintaining high availability of systems at mortgage platform provider Pylon. Focus on developer productivity and codebase quality with instant feedback from peers.

Hybrid Role

Palo Alto United States Infrastructure Engineer

$140,000 - $220,000 per year

last week

Infrastructure Systems Engineer II

Conduent

Infrastructure Systems Engineer II managing production application support for Conduent. Collaborating on ITIL processes and incident management while working in a 24/7 environment.

Hybrid Role

Hyderabad India Infrastructure Engineer