Senior Site Reliability Engineer – Hybrid

Posted 1 hour ago

About the role

  • As a Senior Site Reliability Engineer, you will design and implement high-reliability platforms for Broadridge, collaborate with teams across hybrid environments, and drive automation and efficiency in service delivery.

Responsibilities

  • Design and implement high-availability, fault-tolerant architectures across on-premises and AWS cloud platforms
  • Lead multi-region disaster recovery (DR) planning, implementation, and testing, including RTO/RPO definition and validation
  • Define and enforce SLOs, SLIs, and error budgets to balance reliability with delivery velocity
  • Drive self-healing automation and proactive remediation strategies
  • Build and maintain infrastructure using Terraform and configuration management tools (e.g., Chef)
  • Develop automation to eliminate manual operational tasks (toil reduction)
  • Create reusable modules, pipelines, and guardrails for standardized deployments
  • Automate certificate lifecycle management, key rotation, and security updates
  • Design and implement end-to-end observability (metrics, logs, traces, synthetic monitoring)
  • Build dashboards, alerts, and runbooks to enable fast detection and resolution of incidents
  • Perform root cause analysis (RCA) and lead post-incident reviews with actionable follow-ups
  • Engineer and operate platforms on AWS, including services such as EKS, EC2, RDS/Aurora, Lambda, API Gateway, CloudFront, WAF, ALB/NLB, CloudWatch, X-Ray, IAM, and Secrets Manager
  • Lead cloud migrations and modernization initiatives, including legacy system refactoring
  • Identify and resolve performance bottlenecks through testing and analysis
  • Design and support CI/CD pipelines enabling safe, repeatable deployments
  • Partner with security and legal teams to meet regulatory and compliance requirements (e.g., data residency, GDPR-related controls)

Requirements

  • 8+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or Systems Engineering
  • Strong programming experience in Python, Java, or similar languages
  • Deep experience with Linux/Unix systems
  • Hands-on expertise with AWS and cloud-native architectures
  • Proven experience with Terraform and Infrastructure as Code
  • Strong understanding of networking, security, and distributed systems
  • Experience operating mission-critical, high-volume platforms

Benefits

  • Professional development opportunities
  • Flexible working hours
  • Health insurance
  • Paid time off

Job title

Senior Site Reliability Engineer – Hybrid

Experience level

Senior

Salary

Not specified

Degree requirement

Bachelor's Degree
