Reliability Engineer on the core team responsible for safe and stable Autonomous Vehicle software releases, collaborating across testing environments to improve reliability and data-driven insight.
Responsibilities
Own the reliability triage framework for the AV software stack, defining how failures from simulation, CI, and on-road validation are detected, categorized, and escalated into actionable insights.
Perform deep debugging and root-cause analysis across autonomy software, ML pipelines, and system integrations, connecting failure symptoms to clear solution paths and corrective actions.
Design and evolve automated triage mechanisms and reliability taxonomies, improving regression detection, flakiness identification, and signal quality as the system and models evolve.
Build and govern reliability data pipelines, providing continuous visibility into stability trends, recurrence patterns, and systemic risks that impact release readiness.
Translate reliability findings into decision-grade communication, influencing prioritization, technical debt reduction, and release confidence in partnership with engineering, safety, and systems stakeholders.
Requirements
Strong proficiency in Python and SQL for automation, analysis, and data pipelines
Proven experience with CI/CD systems (GitHub Actions, Jenkins, GitLab, or equivalent)
Hands-on experience implementing ETL/ELT pipelines for reliability, quality, or system health monitoring
Solid understanding of reliability engineering concepts, including regression tracking, flakiness detection, and failure classification
Strong analytical and cross-stack debugging skills in large-scale software systems
Experience integrating simulation, HIL, or system-level test signals into automated analysis workflows
Track record of effective cross-functional collaboration across engineering, QA, and platform teams
Ability to operate autonomously in high-ambiguity, safety-critical environments
Excellent communication skills for presenting data-driven reliability insights to engineering and technical leadership
Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field, or equivalent experience
Benefits
GM offers a variety of health and wellbeing benefit programs.
Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.