Hybrid Senior Site Reliability Engineer

Posted 4 weeks ago

About the role

  • As a Senior Site Reliability Engineer at BetterUp, you will use AI tools to monitor and troubleshoot systems, build cloud infrastructure, and collaborate with engineering teams to improve reliability.

Responsibilities

  • Leverage AI-powered tools and automation to transform how we monitor, troubleshoot, and maintain production systems
  • Build and operate cloud infrastructure on AWS, using Terraform to codify and version-control our entire environment
  • Manage and scale Kubernetes clusters that power BetterUp's platform, ensuring high availability and performance
  • Design intelligent alerting and observability systems
  • Collaborate with engineering teams to embed reliability into the development lifecycle, shifting left on operational concerns
  • Automate incident response workflows and build self-healing infrastructure
  • Experiment with and adopt emerging AI tools for log analysis, anomaly detection, and predictive maintenance
  • Drive continuous improvement through data-driven retrospectives and reliability metrics

Requirements

  • 4+ years of experience in SRE or infrastructure roles
  • Genuine excitement about AI tooling: you already use copilots, AI assistants, or LLM-based tools in your workflow and want to push your skills further in this area
  • Deep experience with AWS
  • Hands-on Kubernetes experience: deploying, scaling, debugging, and securing clusters
  • Strong Terraform skills with experience managing complex, multi-environment infrastructure
  • Familiarity with modern observability stacks (Datadog, Prometheus, OpenTelemetry)
  • Strong debugging instincts and comfort navigating distributed systems
  • Clear communication skills: you can explain a production incident to engineers and executives alike
  • A builder's mindset: you see manual processes as opportunities for automation

Benefits

  • Access to BetterUp coaching for you and for a friend or family member
  • A competitive compensation plan with opportunity for advancement
  • Medical, dental, and vision insurance
  • Flexible paid time off
  • All federal/statutory holidays observed
  • 4 BetterUp Inner Workdays
  • 5 Volunteer Days to give back
  • Learning and Development stipend
  • Company-wide Summer & Winter breaks
  • Year-round charitable contribution of your choice on behalf of BetterUp
  • 401(k) self-contribution

Job title

Senior Site Reliability Engineer

Experience level

Senior

Salary

$164,000 - $205,000 per year

Degree requirement

Bachelor's Degree
