Site Reliability Engineer at Lloyds Banking Group on the Financial Wellbeing Platform, establishing SRE functions and improving service reliability.
Responsibilities
Create documentation that details the establishment of the SRE function within the platform, supported by procedures and guidelines that incorporate existing documentation
Provide a framework in which to operate the cloud systems under
Lead the transition to cloud infrastructure and improve observability across systems
Identify and eliminate toil through automation
Manage incidents and post-mortems to improve service reliability
Mentor engineers and support team development
Collaborate with Product Owners to balance operational and development priorities
Requirements
Proven experience as a Site Reliability Engineer in cloud environments (GCP or AWS)
Understanding of SRE principles including SLIs, SLOs, error budgets, and toil reduction
Strong scripting and infrastructure-as-code (IaC) skills (Terraform, Harness, GitHub)
Demonstrable experience of Agile ways of working, with a focus on delivering customer value and applying the Agile mindset; familiarity with tools like Jira
Ability to lead incident response and drive service improvements
Strong collaboration and mentoring skills
Azure cloud environment experience, including connectivity, data buckets, secrets management, migration, and governance challenges
Familiarity with containerisation, CI/CD, and infrastructure tooling such as Docker, Jenkins, GitHub, and Terraform
Secure programming practices and experience of secure file transfer protocols, risk remediation, and audit actions
Technical operations and service engineering
Benefits
A generous pension contribution of up to 15%
An annual performance-related bonus
Share schemes including free shares
Benefits you can adapt to your lifestyle, such as discounted shopping
30 days’ holiday, with bank holidays on top
A range of wellbeing initiatives and generous parental leave policies
Lead Infrastructure Engineer at U.S. Bank responsible for managing and configuring cloud systems and infrastructure technologies while promoting automation practices.
Site Reliability Engineer focused on automation and optimization of software application performance. Collaborating with cross-functional teams to enhance scalability and reliability in Chennai/Bangalore.
Site Reliability Engineer ensuring the availability and performance of services for autonomous vehicle operations. Collaborating on system design and automation in a robotics-focused environment.
DevOps Engineer automating continuous deployment and monitoring on AWS for Crown Equipment Corporation. Bridging developers, IT, and external providers for operational efficiency.
Senior DevOps Engineer responsible for leading CI/CD pipeline design and optimization. Collaborating with teams to drive DevOps maturity across the enterprise while managing infrastructure automation.
Cloud Operations Engineer ensuring reliable performance of cloud systems at 2Innovate. Focused on automation, incident management, cloud security, and infrastructure monitoring in cloud environments.
AWS DevOps Engineer responsible for delivering scalable digital experiences for EXL's MarTech ecosystem. Engaging in development, maintenance, and collaboration across stakeholders and services.
Senior Site Reliability Engineer managing critical infrastructure at Hornetsecurity. Collaborating with product teams to ensure performance and reliability across services.
Site Reliability Engineer enhancing platform reliability for AI workflows at WRITER. Overseeing automated solutions and cloud infrastructure supporting high-traffic AI systems.