Hybrid Site Reliability Engineer

Posted last month

About the role

  • As a Site Reliability Engineer at Qoria, you will manage reliability and observability across global software products for child digital safety technology, collaborating with teams to design solutions and improve product reliability.

Responsibilities

  • Join our Global SRE team responsible for owning the reliability and observability of our software
  • Work using a DevOps mindset to help teams adopt observability and reliability best practices
  • Manage Infrastructure as Code
  • Act as a first responder in our Critical Incident Process to ensure children remain protected
  • Work closely with teams to understand their pain points and design practical solutions that remove friction and improve the reliability of their products

Requirements

  • Experience operating cloud platforms and services at scale (we use GCP predominantly, but also highly value experience in AWS or Azure)
  • Knowledge of good monitoring and alerting practices, and experience of handling on-call rotations and pages for production systems
  • Proficiency with Terraform and managing Infrastructure as Code
  • Proficiency with at least one programming language, and a track record of automating away low-value tasks instead of manually repeating them
  • Proficiency with Git and software collaboration tools (we use GitHub and Jira)
  • Great communication skills, both written and verbal, to document patterns and champion practices among teams
  • A strong sense of ownership and accountability, and a passion for learning
  • Experience managing Kubernetes and associated tooling in production
  • Experience with observability tooling, including tracing and profiling (we use Datadog, but view experience with equivalent tools equally)
  • Experience delivering software end-to-end and managing it in production
  • Familiarity with microservice architectures and debugging distributed systems
  • Understanding of durable computing concepts
  • Strong understanding of security concepts and securing cloud infrastructure
  • A passion for improving the day-to-day experience of developers within the organisation, and the platforms they use
  • Ability to select appropriate cloud reference architectures for reliable applications
  • Ability to develop a conceptual understanding of systems on demand
  • Excellent asynchronous working skills to aid our global team

Benefits

  • Employee Share Scheme
  • Additional leave days
  • Generous pension (up to 5% match)
  • Tech Allowance
  • Cycle to Work and EV salary sacrifice schemes
  • Flexibility in your working arrangement
  • ... and much more

Job title

Site Reliability Engineer

Job type

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
