Staff Software Engineer focused on incident management to improve system reliability at Insulet. Collaborating with Incident Managers and teams to automate detection and response processes.
Responsibilities
Driving the incident management process and coordinating with all teams involved in resolving the incident, including SRE, R&D, IT, vendors, and stakeholders
Responding to incidents and initiating the incident management process
Prioritizing incidents according to their urgency and business impact
Coordinating response efforts and collaborating with the incident response team to ensure that all protocols are diligently followed
Communicating with internal stakeholders about major incidents and their impact
Producing documents that outline incident timelines and actions taken during the incident
Coordinating post-incident RCAs with responders and SMEs and communicating the findings to stakeholders
Designing and implementing automation for incident detection, triage, and resolution
Developing and maintaining runbooks, playbooks, and tooling to streamline incident response
Collaborating with Incident Managers to improve processes and reduce Mean Time to Recovery (MTTR)
Participating in major incident response efforts, providing technical leadership during high-severity events
Leading post-incident reviews and implementing preventive measures to avoid recurrence
Requirements
Bachelor’s degree required (preferred field of study: Computer Science, Engineering, or related field)
7+ years of experience in software engineering, operations, or reliability roles
3+ years focused on incident management or operational resilience
Proven track record of improving incident response processes and reducing MTTR
Proven experience architecting and managing highly available, scalable, and fault-tolerant systems
Strong understanding of cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration technologies (e.g., Kubernetes)
Strong understanding of incident management principles and frameworks (e.g., ITIL)
Hands-on experience with incident response in complex, distributed systems
Proficiency in scripting or automation (Python, Bash, or similar) for operational tasks
Familiarity with monitoring and alerting tools (e.g., Datadog, Prometheus, Grafana)
ETL/Data Validation QA professional responsible for validating Informatica-to-Oracle PL/SQL migrations and data accuracy in SAP Commissions. Execute manual and automated tests and manage test cases efficiently.
Senior Software Engineer responsible for designing scalable systems at GEICO. Collaborating across teams while guiding quality practices in a fast-paced environment.
Staff Software Engineer developing reliability software for GM Autonomous Vehicles, collaborating across teams to enhance multi-sensor systems and improve data quality.
Senior Software Engineer developing and implementing vehicle simulation components for General Motors. Collaborating with technical experts to optimize performance and maintainability in vehicle modeling.
Senior Software Engineer developing and maintaining datapath software components for F5’s cybersecurity innovations. Collaborating across teams to optimize hardware and software integration.
Software Engineer building tools that shape how Homebase engineers ship software. Contributing to AWS infrastructure while improving internal developer experience as part of a collaborative team.
Staff Software Engineer at Pfizer designing software systems and leveraging AI tools to enhance productivity. Working closely with business units to solve real problems through software solutions.
Principal Software Engineer designing and maintaining software systems that deliver business value at Pfizer. Focusing on innovative tooling and architecture for enhanced productivity.
Principal Engineer leading AI solutions for Customer Facing Colleagues at Pfizer. Driving technology innovation and collaboration across digital platforms and engineering teams.
Product Engineer at Rose Bikes developing innovative bikes from concept to production, collaborating with international suppliers and internal teams in a hybrid work environment.