Lead Site Reliability Engineer enhancing SRE practices and ensuring quality software delivery at Cox Automotive. Collaborating with agile teams on cloud-based solutions and automation initiatives.
Responsibilities
Lead efforts to enhance site-reliability engineering practices across agile teams
Ensure the successful delivery of quality software products
Lead the implementation of IaC and automation with unit testing, quality gates, AppSec tooling, automated functional testing, and performance testing
Develop and maintain observability practices, including SLIs, SLOs, SLAs, error budgets, dashboards, alerting, and notification systems
Create and manage runbooks, gamedays, postmortems, and incident response workflows
Engage with cross-functional teams that design, develop, and implement scalable, enterprise cloud-based solutions to drive improvements in SRE/DevOps best practices and tooling
Play a role in establishing best practices, supporting communities of practice, and promoting adoption within the engineering teams
Be an excellent team member, assisting other team members in achieving the team's sprint goals
Requirements
Bachelor’s degree in a related discipline and 6 years’ experience in a related field
The right candidate could also have a different combination, such as a master’s degree and 4 years’ experience; a Ph.D. and 1 year of experience; or 18 years’ experience in a related field
Applicants must currently be authorized to work in the United States for any employer without current or future sponsorship
Minimum 7 years of relevant professional experience
Minimum 5 years of experience in CI/CD tooling and capabilities
Minimum 5 years of experience in implementing quality frameworks with quality gates within a CI/CD framework
Minimum 5 years of experience working in AWS
Minimum 5 years of experience with Infrastructure as Code, either Terraform or AWS CloudFormation
Minimum 5 years of experience working in observability platforms such as New Relic, CloudWatch, Grafana, Datadog
Minimum 5 years of experience working in one of the following programming languages: C#, Java, Python
Minimum 3 years of experience working in AppSec tools such as Veracode, CloudSploit, Data Theorem
Benefits
Flexibility for employees to take as much paid vacation as they deem consistent with their duties, the company's needs, and its obligations
Seven paid holidays throughout the calendar year
Up to 160 hours of paid wellness time annually for the employee's own wellness or that of family members
Additional paid time off in the form of bereavement leave
Time off to vote
Jury duty leave
Volunteer time off
Military leave
Parental leave
Health care insurance (medical, dental, vision)
Retirement planning (401(k))
Paid days off (sick leave, parental leave, flexible vacation/wellness days, and/or PTO)