About the role

  • Delivering against GCP and SRE Public Cloud technology roadmaps
  • Collaborating with engineering teams to release and evolve enterprise-class solutions
  • Managing operations of critical banking services, including 24x7 coverage via on-call rota
  • Enhancing resiliency and reliability of customer-facing services
  • Troubleshooting and diagnosing issues with an engineering mindset
  • Building tooling to support service reliability and code quality
  • Working across multiple labs and signature projects in the Digital space
  • Leading Chaos Engineering initiatives to stress test services

Requirements

  • Strong understanding of SRE & DevOps, including experience of Infrastructure as Code and CI/CD pipelines using tools such as Azure DevOps, Terraform, or Jenkins.
  • Proficiency with Incident Management software (e.g. ServiceNow).
  • Proficiency in Dynatrace, Splunk, SRE GCP & Cloud Observability.
  • Demonstrable experience in using orchestration tools such as Harness.
  • Knowledge of GCP and Azure cloud platforms.
  • Experience in identifying toil and designing automated solutions to remove it.
  • Reliability & Performance Management: Design, implement and own the SLOs for critical platform services. Monitor system health, manage error budgets, and drive improvements in Mean Time to Failure (MTTF) and Mean Time to Recovery (MTTR).
  • Incident & Problem Management: Lead incident response and post-mortem analysis. Ensure root cause identification and long-term remediation strategies are implemented.
  • Platform Advocacy & Collaboration: Champion SRE principles across the Segments & Propositions Lab. Collaborate with Lab Product Owners, Engineering Leads, and application teams to embed reliability into design and delivery.
  • Technical Leadership: Provide technical oversight across cloud infrastructure, CI/CD pipelines, observability tooling, and automation frameworks. Guide engineers in adopting scalable and resilient solutions.
  • Continuous Improvement: Identify and implement improvements in deployment, monitoring, and alerting processes. Drive automation to reduce toil and improve operational efficiency.
  • Governance & Compliance: Ensure platform services adhere to internal risk, security, and compliance standards. Support audit and regulatory reporting requirements.

Benefits

  • A generous pension contribution of up to 15%
  • An annual performance-related bonus
  • Share schemes including free shares
  • Benefits you can adapt to your lifestyle, such as discounted shopping
  • 30 days’ holiday, with bank holidays on top
  • A range of wellbeing initiatives and generous parental leave policies

Job title

Senior Site Reliability Engineer

Job type

Experience level

Senior

Salary

£70,929 - £106,394 per year

Degree requirement

Bachelor's Degree

Location requirements
