Hybrid Site Reliability Engineer

About the role

  • Site Reliability Engineer at Manulife responsible for system reliability, performance, and scalability. The role involves collaborating with development teams, enhancing observability, and participating in incident response.

Responsibilities

  • Design, implement, and maintain infrastructure, tooling, and automation to improve service reliability and scalability
  • Partner with development teams to ensure applications are designed for reliability, performance, and operational excellence
  • Monitor system health, performance, and availability; troubleshoot and resolve issues to minimize customer impact
  • Build and enhance observability capabilities, including metrics, logs, and traces, to enable early detection of issues
  • Participate in on-call rotations to support critical systems and ensure timely incident response
  • Perform root cause analysis for incidents and drive corrective actions to prevent recurrence
  • Define, track, and improve Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs)
  • Support disaster recovery and resilience initiatives, including recovery testing and continuous improvement of recovery readiness
  • Promote proactive monitoring and alerting practices to reduce mean time to detection (MTTD) and mean time to resolution (MTTR)

Requirements

  • Post-secondary education in Computer Science, Engineering, or related field, or equivalent practical experience
  • 2–5 years of experience in a Site Reliability Engineering, DevOps, or similar reliability-focused role
  • Strong knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and container technologies (e.g., Docker, Kubernetes)
  • Proficiency in scripting and automation using languages such as Python, Bash, or similar
  • Experience with monitoring, logging, and observability tools (e.g., New Relic, Grafana, ADX)
  • Experience with Power Platform tools, particularly Power BI, for operational reporting and insights
  • Familiarity with ITIL and Agile delivery methodologies
  • Strong problem-solving skills, attention to detail, and ability to work effectively under pressure
  • Excellent communication and collaboration skills, with the ability to work across technical and business teams

Benefits

  • Health, dental, mental health, and vision insurance
  • Short- and long-term disability insurance
  • Life and AD&D insurance coverage
  • Adoption/surrogacy benefits
  • Wellness programs
  • Employee/family assistance plans
  • Retirement savings plans, including a pension with employer matching contributions
  • Financial education and counseling resources
  • Generous paid time off, including holidays, vacation, personal, and sick days
  • Full range of statutory leaves of absence

Job title

Site Reliability Engineer

Job type

Experience level

Junior, Mid level

Salary

CA$86,100 - CA$136,100 per year

Degree requirement

Bachelor's Degree

Location requirements
