Site Reliability Engineer focused on incident management and system resiliency at Enova International. Leads incident response and uses analytics to drive continuous improvement in system reliability.
Responsibilities
Lead production incidents as part of our PI PIC (or Incident Commander) rotation after completing training, ensuring clear communication and resolution.
Capture and maintain detailed documentation of incidents, contributing factors, and learnings in formal incident reports.
Deliver documentation that is clear, comprehensive, and accessible to different types of audiences in a timely manner within the established SLAs.
Facilitate and document blameless post-incident reviews that promote learning and continuous improvement.
Collect and analyze incident data to identify systemic issues, risks, and trends.
Work on improvements to how we collect, analyze, and learn from system failures.
Requirements
2+ years of experience in a technology or analyst role (e.g., Software Engineering, Systems, Operations, SRE, or Product).
A strong interest in how complex distributed systems operate and how to make them more reliable.
Analytical and problem-solving skills with a systems-thinking mindset.
Strong communication skills, both verbal and written, with the ability to tailor messaging to technical and non-technical audiences.
Comfort with ambiguity, and the ability to turn vague problems into actionable insights.
Demonstrated maturity, sound judgment, and organizational awareness.
Ability to coordinate the resolution of major incidents and post-incident reviews following Enova’s Incident Management Process.
Ability to seamlessly shift between high-urgency incident response and structured project work, with strong organizational skills and the capacity to manage projects independently.
Benefits
Health, dental, and vision insurance including mental health benefits
401(k) matching plus a Roth option (U.S.-based employees only)
PTO and paid holidays
Sabbatical program (for eligible roles)
Summer hours (for eligible roles)
Paid parental leave
DEI groups (B.L.A.C.K. @ Enova, HOLA @ Enova, Women @ Enova, Pride @ Enova, South Asians @ Enova, APEX @ Enova, and Parents @ Enova)
Employee recognition and rewards program
Charitable matching and a paid volunteer day… Plus so much more!
Job title
Site Reliability Engineer – Incident Management, Resiliency