Senior Site Reliability Engineer responsible for the reliability and performance of the Ford Service Reservation Platform. Leading SRE practices and technical initiatives in a hybrid role based in Dearborn, MI.
Responsibilities
Ensure the reliability, performance, and scalability of the Ford Service Reservation Platform and its associated applications.
Lead the implementation and continuous evolution of Site Reliability Engineering (SRE) practices.
Define, implement, and maintain Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Collaborate with engineering teams to prioritize reliability work and incident follow-ups.
Own, evolve, and optimize observability solutions using Dynatrace.
Develop and deploy infrastructure as code using Terraform scripts.
Establish and refine Incident Management and Problem Management processes.
Requirements
Bachelor’s degree in Computer Science, Computer Engineering, Systems Engineering or equivalent combination of relevant education and experience.
7+ years of experience in Software Engineering, DevOps, or Systems Administration.
5+ years of dedicated experience in a Site Reliability Engineering (SRE) or Platform Engineering role.
2+ years of experience leading technical initiatives or mentoring junior engineers in an SRE context.
Preferred: Master’s degree in Computer Science, Computer Engineering, Systems Engineering, or a related field.
Certifications:
Google Professional Cloud Architect or Google Professional Cloud DevOps Engineer.
Dynatrace Professional Certification.
Terraform Associate Certification.
Experience working on high-traffic reservation systems, e-commerce platforms, or automotive service applications.
Benefits
Visa sponsorship is available for this position.
Equal Opportunity Employer.
Reasonable accommodation is available for applicants who need it to complete the online application process due to a disability.
Senior DevOps Engineer developing core infrastructure supporting Shelf products. Focused on building reliable, secure, and scalable systems in a hybrid work environment.
Cloud/Kubernetes Engineer supporting hybrid infrastructure across AWS and on-premises Kubernetes environments. Automating tasks and managing production reliability, security, and scalability.
AWS Infrastructure DevOps Engineer at Growth Acceleration Partners supporting AWS environments and infrastructure automation. Focused on reliability, security, and operational efficiency across production environments.
Site Reliability Engineer driving innovation and automation for Banking Solutions and Payments. Collaborating with teams to ensure application performance and reliability in a dynamic environment.
Mainframe SRE working on critical payment systems for fintech, ensuring stability and security. Collaborating with teams to perform root cause analysis and automate processes.
DevOps Engineer responsible for cloud product delivery, platform reliability, and using AI tools in DevOps workflows. Building CI/CD pipelines and optimizing container workloads for security and performance.
Senior DevOps Engineer for Paysafe, designing and deploying AWS applications and infrastructure. Collaborating on cloud environments and improving processes for scalable solutions.
Senior Site Reliability Engineer at Broadridge managing infrastructure design and operational support. Collaborating with teams to improve automation, performance, and reliability of services in a hybrid environment.
DevSecOps Engineer building and maintaining Azure DevOps cloud applications with an API backend. Responsibilities include developing CI/CD pipelines and automating backend tasks.
Reliability Engineer II at Cargill applying technical expertise to enhance process and asset reliability. Collaborating with teams to execute engineering strategies for equipment optimization in a salt mine setting.