Senior Software Engineer building automation platforms for incident response at Cox Automotive. Focusing on AI-driven reliability solutions and engineering collaboration within the team.
Responsibilities
Build automation that reduces toil and empowers engineering teams
Create tools and platforms that help teams understand and improve their system reliability
Reimagine how we learn from incidents and turn insights into preventive measures
Experiment with new approaches to observability, monitoring, and alerting
Bring your engineering expertise to complex production challenges
Explore how AI can transform incident detection, triage, and response
Partner with teams across the organization to review and analyze incidents and solve reliability problems at scale
Drive technical conversations that shape how Cox Automotive builds resilient systems
Turn operational pain points into engineering opportunities
Define what modern incident response engineering looks like for our organization
Requirements
Professional experience with statically typed languages (Java, C#, Go) and dynamically typed languages (Python, Ruby, JavaScript), and an understanding of the tradeoffs of each
Distributed systems expertise and understanding of failure modes
Experience building internal platforms, developer tools, or automation that scales
Git/version control and CI/CD pipeline experience
Infrastructure as code and API design experience
Track record eliminating toil through intelligent automation
Production ownership experience (on-call, incident response, observability)
Systems thinking mindset: understanding how components interact at scale
Eager to dig into problems and bring proposed solutions to group discussion
Open to feedback and able to creatively adapt multiple ideas into solutions
Strong technical writing, including high-level and low-level diagramming techniques
Analytical skills and careful attention to detail
Bachelor’s degree in a related discipline and 4 years’ experience in a related field
The right candidate could also have a different combination, such as a master’s degree and 2 years’ experience; a Ph.D. and up to 1 year of experience; or 16 years’ experience in a related field.
Benefits
The Company offers eligible employees the flexibility to take as much vacation with pay as they deem consistent with their duties, the company’s needs, and its obligations
Seven paid holidays throughout the calendar year
Up to 160 hours of paid wellness annually for their own wellness or that of family members
Additional paid time off in the form of bereavement leave, time off to vote, jury duty leave, volunteer time off, military leave, and parental leave
Health care insurance (medical, dental, vision)
Retirement planning (401(k))
Paid days off (sick leave, parental leave, flexible vacation/wellness days, and/or PTO)