Senior Site Reliability Engineer at Onebrief | Hybrid Hired

About the role

Site Reliability Engineer responsible for application reliability and security in DoD environments. Collaborates with the Infrastructure & Security team to enhance service quality and operational efficiency.

Responsibilities

You'll own the reliability, scalability, and security of the production application and/or platform.
Building a World-Class Observability Platform: Design, implement, and manage our monitoring, logging, and alerting stack (e.g., Prometheus, Loki, Alloy, and Grafana).
Defining and Upholding Reliability: Define, measure, and own alerting that feeds into our Service Level Objectives (SLOs).
Leading Incident Response: Act as an incident responder, and potentially as incident commander, during critical incidents.
Automating for Scale and Security: Partner with platform engineers to design, build, and manage secure, resilient Kubernetes clusters.
Eliminating Toil and Scaling the Team: Proactively identify and eliminate operational toil by building automation.

Requirements

3 years of experience in Site Reliability Engineering or a related field, with firsthand experience managing mission-critical systems in air-gapped DoD environments.
An active Top Secret security clearance. U.S. citizenship required.
Experience automating software delivery and deployment, and providing documentation and self-service tools for engineering teams and customers.
A strong understanding of Linux, containerization and orchestration, and virtual machines.
Experience with centralized logging, metrics, and observability using tools such as Prometheus, Loki, Grafana, ELK stack, or Datadog.
Networking fundamentals: core protocols and secure configurations.
A deep understanding of incident response processes, with experience conducting thorough root cause analyses and driving continuous improvement.
Clear, concise writing; strong documentation habits and async communication.
Core skills and technologies: VMware, Kubernetes, Docker, Helm, Ansible, Terraform, Linux, AWS, DoD compliance, and monitoring and observability tools.

Benefits

Relocation assistance provided
Active Top Secret Clearance required; SCI eligibility is a plus.
