SRE/DevOps Engineer improving platform reliability for a multi-award-winning digital payments platform. Working from UK offices and collaborating with engineers to build a developer-friendly platform.
Responsibilities
Design, build and maintain secure, scalable cloud infrastructure across AWS and Azure
Manage and enhance our Kubernetes (EKS) platform to support reliable, modern applications
Develop and maintain Infrastructure as Code using Terraform and Helm
Improve and support CI/CD pipelines using Argo Workflows, ArgoCD and GitHub Actions
Lead and participate in incident response, including on‑call activities and major incident coordination
Drive high‑quality monitoring, alerting and observability across metrics, logs and traces
Conduct and support blameless post‑incident reviews, ensuring follow‑up actions are delivered
Define and implement SLIs/SLOs to improve service reliability and operational excellence
Collaborate with engineering teams to embed best practices and improve developer experience
Contribute to automation, tooling, and continuous improvements that reduce toil and increase platform resilience
Requirements
Proven experience in DevOps, SRE, or Platform Engineering roles
Strong hands‑on experience running Kubernetes in production
Experience with AWS and/or Azure cloud platforms
Solid experience with Terraform and IaC automation
Experience participating in or managing production incidents and on‑call
Strong grasp of monitoring, alerting, and observability principles
Ability to diagnose and fix complex distributed systems issues
Demonstrated use of GenAI tools (ChatGPT, GitHub Copilot, Claude) in engineering workflows
Excellent communication and calmness under pressure
A passion for automation and reducing toil
Benefits
Competitive salary
Company bonus scheme
Private Healthcare and Medicash plan
26 days holiday plus bank holidays and volunteer days
Tax-saving Salary Sacrifice Pension with Aviva
Salary sacrifice Cycle to Work, Octopus Electric Vehicle, and Nursery fee schemes
Access to a benefits platform with £250 per year for wellbeing and £150 per year for development
Bumper Flex policy for better work/life balance
Annual company-wide Bumper Retreat
4 months' paid leave for primary carers and 1 month for secondary carers