SRE Observability SLO Engineer at GE Vernova | Hybrid Hired

About the role

SRE Observability SLO Engineer on GE Vernova’s GridOS Platform Engineering team, building the telemetry stack that supports SaaS reliability for critical energy infrastructure.

Responsibilities

Implement organization-wide telemetry standards covering metrics, logs, and distributed traces across all GridOS SaaS services.
Implement metrics collection for Kubernetes-hosted services (EKS/Rancher) including pod-level, namespace-level, and cluster-level metrics.
Publish and maintain an Observability Runbook library covering onboarding, alert tuning, and dashboard standards for Platform SRE and Production DevOps teams.
Partner with product engineering, Platform SRE, and customer stakeholders to define meaningful Service Level Indicators (SLIs) and Service Level Objectives (SLOs) per product and customer tier.
Build and maintain SLO tooling — error budget burn-rate alerts, burn-rate dashboards, and automated SLO compliance reports.
Design and build operational dashboards covering availability, latency, error rates, and saturation (the 'Golden Signals') for every GridOS SaaS product.
Create executive-level dashboards for SRE leadership and customer-facing uptime/availability reports aligned to contractual SLAs.
Conduct periodic observability health reviews to identify gaps in coverage, reduce MTTD (Mean Time to Detect), and improve MTTR (Mean Time to Resolve).

Requirements

2–3 years in SRE, observability engineering, or infrastructure reliability roles.
Deep expertise with at least one major observability platform — Datadog, Grafana + Prometheus, AWS CloudWatch, Dynatrace, or New Relic.
Hands-on experience implementing SLIs, SLOs, and error budget burn-rate alerting in a production SaaS environment.
Strong understanding of distributed systems telemetry: metrics (Prometheus/CloudWatch), structured logging (CloudWatch Logs Insights, ELK), and distributed tracing (OpenTelemetry, AWS X-Ray).
Experience with Kubernetes observability — kube-state-metrics, node exporters, Helm-deployed monitoring stacks, and namespace-level resource metrics.
Proficiency in at least one query/visualization language: PromQL, Splunk SPL, Datadog Query Language, or CloudWatch Logs Insights query syntax.
Experience designing alerting strategies that minimize alert fatigue through symptom-based and burn-rate approaches.
Scripting skills in Python and/or Bash for automation of monitoring configuration and report generation.

Benefits

Relocation Assistance Provided
