Design, implement, and maintain scalable HPC infrastructure (cloud and on-prem) to support GBI’s computational research workloads.
Evaluate and integrate advanced technologies including GPU/TPU acceleration, high-speed interconnects, and parallel file systems.
Manage HPC environments, including Linux-based clusters, schedulers (e.g., Slurm), and high-performance storage systems (e.g., Lustre, BeeGFS, GPFS).
Implement robust monitoring, fault-tolerance, and capacity management for high availability and reliability.
Develop automation scripts and tools (Python, Bash, Ansible, Terraform, Go, Helm, etc.) for provisioning, configuring, and scaling HPC resources.
Support reproducible research through containerization (Singularity, Docker, etc.), workflow orchestration (Nextflow, Kubernetes, OpenHPC, etc.), and MLOps.
Collaborate with researchers to address common bottlenecks in their scientific computing workflows.
Provide technical support and guidance for job scheduling, workflow optimization, and performance tuning.
Collaborate with information security teams to manage user access and protect sensitive research data.
*Additional responsibilities at the senior level:*
Work with the Head of Scientific Compute on long-term strategy and architecture for GBI’s computing platforms.
Collaborate with researchers to understand present and future computational needs and translate them into cloud and HPC requirements and operational policy.
Work with HPC and cloud vendors to ensure computational resources at GBI meet the needs of its researchers.
Requirements
**Essential Knowledge, Skills and Experience:**
Bachelor’s or Master’s degree in Computer Science, Computational Biology, Engineering, or related discipline (PhD desirable).
3+ years (5+ years at the senior level) of relevant experience managing HPC systems in a research, biological or biomedical, or academic environment.
Ability to work collaboratively with multidisciplinary research teams and translate computational needs into technical solutions.
Excellent communication and documentation skills for both technical and non-technical audiences.
**Technical Expertise**
***At the regular level:***
Extensive experience using HPC clusters (or cloud computing) in scientific or research settings.
Proficiency in Linux system administration, networking, and parallel computing (MPI, OpenMP, CUDA, or ROCm).
Experience using HPC job schedulers (Slurm preferred) and parallel file systems (Lustre, BeeGFS, GPFS).
***At the senior level:***
Extensive experience designing, deploying, and managing HPC clusters (or cloud computing) in scientific or research settings.
Strong proficiency in Linux system administration, networking, and parallel computing (MPI, OpenMP, CUDA, or ROCm).
Extensive expertise in administering HPC job schedulers (Slurm preferred) and parallel file systems (Lustre, BeeGFS, GPFS).
***At all levels:***
Familiarity with containerization, workflow automation, and orchestration tools used in bioinformatics and AI/ML.
Skilled in scripting and automation using Python, Bash, and configuration management tools (Ansible, Terraform).
Demonstrated experience profiling and optimizing scientific or machine learning workloads on large-scale clusters.
Understanding of distributed computing frameworks and GPU-based acceleration techniques.
Benefits
**We offer the following benefits:**
Enhanced holiday pay
Pension
Life Assurance
Income Protection
Private Medical Insurance
Hospital Cash Plan
Therapy Services
Perkbox
Electric Car Scheme
**Why work for EIT:**
At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!
**Terms of Appointment:**
Applicants must have the right to work in the United Kingdom. Due to the highly specialised technical nature of the role, exceptional international applicants may be considered for sponsorship where appropriate.
You must be based in, or within easy commuting distance of, Oxford.
During peak periods, some longer hours and some working across multiple time zones may be required, owing to the global nature of the programme.
Job title
High-Performance Computing Engineer – Generative Biology Institute