Hybrid SRE Lead, Observability

Posted 3 weeks ago

About the role

  • Develop and lead enterprise observability and reliability capabilities for Parts Town's systems using Dynatrace. Collaborate across teams to ensure comprehensive monitoring and improve performance and incident outcomes.

Responsibilities

  • Own enterprise observability using Dynatrace across cloud, on-prem, ERP, WMS, eCommerce, APIs, and integrations
  • Design service topology, dashboards, alerts, and health indicators that reflect business impact
  • Apply SRE principles (SLIs, SLOs, error budgets where appropriate) to reduce incidents and improve resilience
  • Accelerate incident detection and root-cause analysis; lead post-incident reviews focused on systemic fixes
  • Identify reliability, performance, and capacity risks before they impact the business
  • Define observability and SRE standards and enable teams to use them effectively

Requirements

  • 7+ years in infrastructure, platform, operations, or reliability engineering
  • Hands-on experience implementing and operating Dynatrace
  • Strong understanding of distributed systems, cloud/hybrid environments, and integrations
  • Practical experience with SRE or reliability engineering concepts
  • Comfortable operating in high-impact incident and production environments

Benefits

  • Quarterly profit-sharing bonus
  • Hybrid work schedule
  • Team member appreciation events and recognition programs
  • Volunteer opportunities
  • Monthly IT stipend
  • Casual dress code
  • On-demand pay options: access your pay as you earn it to cover unexpected or everyday expenses
  • Health insurance
  • 401(k) with 401(k) match
  • Employee assistance programs
  • Paid time off

Job title

SRE Lead, Observability

Job type

Experience level

Senior

Salary

$99,133 - $133,784 per year

Degree requirement

No Education Requirement

Location requirements
