Site Reliability Engineer ensuring the availability and performance of services for autonomous vehicle operations. Collaborating on system design and automation in a robotics-focused environment.
Responsibilities
Design and implement highly scalable and reliable systems to support Zoox's autonomous vehicle platform.
Optimize system performance, reliability, and scalability.
Develop and maintain monitoring, alerting, and reporting systems to ensure proactive identification and resolution of issues.
Collaborate with software engineering teams to improve software architecture, deployment processes, and automation.
Conduct root cause analysis of production issues and implement corrective actions.
Implement disaster recovery and business continuity plans.
Requirements
5+ years of experience in site reliability engineering or a similar role, with a strong background in working with large-scale distributed systems.
Proven experience with cloud platforms such as AWS, GCP, or Azure.
Expertise in container orchestration technologies like Kubernetes.
Deep understanding of networking, storage, and database technologies.
Strong programming skills in languages such as Python, Go, C/C++, or Java.
Experience with infrastructure as code tools such as Terraform, Ansible, Salt, or CloudFormation.
Benefits
Paid time off (e.g., sick leave, vacation, bereavement)
DevOps Engineer ensuring stability, scalability, and reliability of justtrack's SaaS platform. Collaborate with development teams, manage cloud infrastructure, and enhance CI/CD processes.
Cloud DevOps Engineer designing and optimizing secure cloud infrastructure on Azure. Collaborating closely with developers for reliable CI/CD processes on cloud-based products.
Staff Site Reliability Engineer responsible for cloud infrastructure implementation and reliability improvements at Auror. Collaborating with engineering teams to enhance production code understanding.
Own availability and strive for operational excellence of Sumo Logic’s observability. Collaborate with global SRE team to optimize operations and improve developer velocity.
Senior Executive supporting technology initiatives in Pune, India. Collaborating globally to connect people and solve complex challenges in a sustainable manner.
DevOps Engineer leading the design, implementation, and optimisation of Kubernetes platforms for Vodafone. Collaborating with product teams to streamline operational processes and enhance developer experience.
Senior Site Reliability Engineer developing scalable systems and automation for high-scale projects at Euna Solutions. Collaborating closely with software developers and mentoring junior engineers.
Senior Site Reliability Engineer responsible for designing scalable systems at Euna Solutions. Collaborating with developers and mentoring juniors while driving automation and reliability.
Senior Site Reliability DevOps Specialist at Boeing overseeing GCP cloud environment and infrastructure. Ensuring reliability, scalability, and automation while collaborating with distributed teams.
Lead DevOps Engineer driving modernization and operational excellence for Enterprise Payments at American Family Insurance. Collaborate across teams and enhance payment processing capabilities.