Staff Site Reliability Engineer – Observability at CVS Health | Hybrid Hired

About the role

Staff Site Reliability Engineer focused on observability at CVS Health, leading the design and implementation of observability systems across distributed and edge computing environments.

Responsibilities

Lead the design, implementation, and optimization of observability systems
Collaborate with cross-functional teams to build robust monitoring, alerting, and telemetry solutions
Drive best practices, mentor others, and shape the strategic evolution of our observability ecosystem
Design and implement comprehensive observability solutions tailored for edge computing environments
Define and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and business KPIs
Build and optimize dashboards, visualizations, and alerting systems
Implement distributed tracing and log aggregation systems
Collaborate with engineering teams to ensure applications and infrastructure at edge locations are designed with observability in mind
Drive proactive identification of issues in edge facilities
Lead incident postmortems and implement observability-driven improvements
Develop and maintain tools, scripts, and automation to enhance observability pipelines
Evaluate and integrate industry-standard observability tools

Requirements

7+ years of experience in Site Reliability Engineering, Observability Engineering, or a related field
5+ years of experience with observability tools and platforms such as Prometheus, Grafana, Splunk, ELK, OpenTelemetry, or similar
3+ years of experience with microservices, containerized environments (e.g., Kubernetes, Docker), and distributed systems, particularly in edge deployments
Experience with implementation of AIOps
Strong proficiency in programming/scripting languages (e.g., Python, Java) for automation and tooling in distributed environments
Certifications in cloud platforms (e.g., Google Cloud Professional certification) or Kubernetes
Knowledge of incident management processes and tools (e.g., ServiceNow, xMatters, Opsgenie) tailored for distributed systems

Benefits

Affordable medical plan options
401(k) plan (including matching company contributions)
Employee stock purchase plan
No-cost programs including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance
Retiree medical access

Similar roles

3 days ago

Site Reliability Engineer II

RELX

Site Reliability Engineer II at LexisNexis Risk Solutions building Terraform modules and CI/CD pipelines. Responsible for developing cloud infrastructure and ensuring reliability, security, and observability.

Onsite Role

London United Kingdom Devops Engineer

4 days ago

Data Transport Infrastructure DevOps Engineer

Leidos

DevOps Engineer supporting cloud modernization for the Department of the Air Force on the Cloud One contract. Involved in systems analysis, security practices, and collaboration with engineering teams.

Hybrid Role

Tewksbury United States Devops Engineer

$69,550 - $125,725 per year

4 days ago

Journeyman Cloud Operations Engineer

Leidos

Journeyman Cloud Operations Engineer maintaining cloud infrastructure across DoD organizations. Supporting DevSecOps and ensuring compliance with security requirements in a high-visibility program.

Onsite Role

Alexandria United States Devops Engineer

$87,100 - $157,450 per year

4 days ago

Site Reliability Engineer

Minor Hotels Europe and Americas

DevOps Engineer managing cloud-native platforms for Capgemini. Collaborating with development, data/ML, and security teams to deliver scalable solutions on Azure.

Onsite Role

Aguascalientes Mexico Devops Engineer

4 days ago

IT Admin – DevSecOps Lead

JamLoop

Head of IT & DevSecOps at JamLoop, managing internal technology and security improvements. Leading strategy and implementation of cloud infrastructure for efficiency and reliability.

Hybrid Role

United States Devops Engineer

$125,000 - $200,000 per year

4 days ago

I&E Maintenance and Reliability Engineer

LyondellBasell

I&E Maintenance and Reliability Engineer at LyondellBasell focused on asset maintenance strategies in a multidisciplinary environment. Collaborating for operational excellence and safety performance at the Pasadena facility.

Onsite Role

Pasadena United States Devops Engineer

4 days ago

Manager, DevOps – Cloud Infrastructure

Thomson Reuters

Manager, DevOps & Cloud Infrastructure overseeing security and operational efficiency in a hybrid environment at Thomson Reuters. Leading teams to deliver secure solutions in on-premises and cloud setups.

Hybrid Role

McLean United States Devops Engineer

$117,700 - $218,600 per year

4 days ago

DevOps Engineer, Customer Care AI Platform Team

IONOS

DevOps Engineer responsible for building and maintaining the infrastructure of IONOS' AI platform. Collaborating on CI/CD pipelines and ensuring system optimization across various locations.

Hybrid Role

Bucharest Romania Devops Engineer

4 days ago

Intermediate DevOps Engineer, AI-Enabled

PointClickCare

DevOps Engineer building and supporting cloud infrastructure at PointClickCare. Collaborate with senior engineers and software teams to enhance AI-enabled workloads and improve system reliability.

Hybrid Role

Mississauga Canada Devops Engineer

$106,000 - $118,000 per year

4 days ago

DevOps Engineer, k8s, Terraform, Grafana, ELK Stack, Kafka, MongoDB

Convercus

DevOps specialist working with Kubernetes and Terraform, ensuring project stability and efficiency for Convercus. Join a small, dynamic team in a hybrid work environment.

Hybrid Role

Inca Spain Devops Engineer

€30,000 - €65,000 per year