Senior MLOps Engineer building and operating the platform for identity-verification products at Entrust. Focused on bridging ML research and production environments with an emphasis on developer experience.
Responsibilities
Run and evolve our ML compute layer on Kubernetes/EKS (CPU/GPU) for multi-tenant workloads, and make workloads portable across regions (region-aware scheduling, cross-region data access, and artifact portability).
Operate Argo Workflows and Dask Gateway as reliable, self-serve services used by engineers and researchers to orchestrate data prep, training, evaluation, and large-scale batch compute (installation, upgrades, security, quotas, autoscaling).
Build GitOps-native delivery for ML jobs and platform components (GitLab CI, Helm, FluxCD) with fast rollouts and safe rollbacks.
Design and maintain our data platform built on LakeFS to enable experiment reproducibility, data lineage tracking, and automated governance processes.
Own developer experience and enablement by creating clear APIs/CLIs and minimal UIs, maintaining comprehensive templates and documentation.
Requirements
You will have some MLOps experience, as well as the following:
You value developer experience and enjoy talking to users (engineers/scientists), removing friction, and treating the platform like a product.
Production experience with AWS and Kubernetes (EKS), including GPU workloads.
Proficiency in Python (e.g., FastAPI/Django) and solid CS fundamentals (performance, concurrency, data structures).
Experience building/operating data pipelines (idempotency, retries, backfills, reproducibility).
Working knowledge of Terraform, Helm, Docker, Git, and GitLab CI/CD.
Observability experience with Prometheus/Grafana and logs (e.g., Loki/Promtail or Splunk/Sentry) with sensible alerting.
Good grasp of networking and security concepts and Linux systems administration.
Benefits
25 days annual leave + RTT + 1 day off for your birthday
Two paid volunteering days per year*
Meal vouchers provided by Swile: 50% is covered and 50% is deducted from your payroll.
Health Insurance (Mutuelle) provided by ALAN
Disability & Life insurance (Prevoyance) provided by ALAN (3x Base Salary)
Commuter reimbursement up to €40 per month
Life enrichment allowance of up to €95 per month to use for services including gym, yoga, fitness classes, massages, childcare, and therapy
Dedicated learning opportunities, including tools like LinkedIn Learning and access to learning resources such as books, coaches, conferences, courses, podcasts, and more
Our open and transparent culture is reflected in our “Better Together” motto
Expense up to £300 (or local equivalent) to purchase workstation setup equipment
The opportunity to join Entrust’s resource groups (our belonging groups) and learn new skills