Staff Software Engineer focused on incident management to improve system reliability at Insulet. Collaborating with Incident Managers and teams to automate detection and response processes.
Responsibilities
Driving the incident management process and coordinating efforts across all teams involved in resolving an incident, including SRE, R&D, IT, vendors, and stakeholders
Responding to incidents and initiating the incident management process
Prioritizing incidents according to their urgency and business impact
Coordinating response efforts and collaborating with the incident response team to ensure that all protocols are diligently followed
Communicating with internal stakeholders on major incidents and impacts
Producing documents that outline incident timelines and actions taken during the incident
Coordinating post-incident RCAs with responders and SMEs and communicating to stakeholders
Designing and implementing automation for incident detection, triage, and resolution
Developing and maintaining runbooks, playbooks, and tooling to streamline incident response
Collaborating with Incident Managers to improve processes and reduce Mean Time to Recovery (MTTR)
Participating in major incident response efforts and providing technical leadership during high-severity events
Leading post-incident reviews and implementing preventive measures to avoid recurrence
Requirements
Bachelor’s degree required (preferred field of study: Computer Science, Engineering, or a related field)
7+ years of experience in software engineering, operations, or reliability roles
At least 3 years focused on incident management or operational resilience
Proven track record of improving incident response processes and reducing MTTR
Proven experience architecting and managing highly available, scalable, and fault-tolerant systems
Strong understanding of cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration technologies (e.g., Kubernetes)
Strong understanding of incident management principles and frameworks (e.g., ITIL)
Hands-on experience with incident response in complex, distributed systems
Proficiency in scripting or automation (Python, Bash, or similar) for operational tasks
Familiarity with monitoring and alerting tools (e.g., Datadog, Prometheus, Grafana)
Director of Software Engineering at Acuity leading AI-enabled digital commerce platform development and transforming user experience with modern architecture.
Senior Product Engineer leading application and integration of protection and control solutions by Hubbell. Collaborating with engineering, sales, and customer support to deploy tailored technical solutions.
Software Engineer leading a team to develop high-quality software solutions for DoD training systems. Supporting the JTSE program at Joint Staff Complex in Suffolk, VA.
Lead Principal Engineer Specialist at SAE facilitating aviation standards through technical management and collaboration. Recruiting and mentoring volunteers while driving continuous improvement initiatives in a hybrid work environment.
Product Engineer overseeing the technical lifecycle of screening and biomass handling products for Valmet. Collaborating with global teams and providing engineering expertise across the product lifecycle.
Lead ETL Developer responsible for ETL solutions involving data integration and automation. Working in a hybrid environment at Canada Life with a strong emphasis on collaboration.
Senior Software Engineer developing high-quality software solutions for Savanta. Collaborating with cross-functional teams in a hybrid work environment to deliver impactful products.
Technical Lead developing and evolving iTakeControl, a clinical trial patient engagement platform at Red Nucleus. Leading in-house product development with a focus on compliance and mentoring engineers.
Principal Software Engineer developing and enhancing secure software systems for Northrop Grumman's CHORD portfolio. Focused on collaboration, team empowerment, and personal responsibility in a complex technical environment.
Software Engineer developing Python applications on Linux for Northrop Grumman's Space Sector. Collaborating with cross-functional teams to deliver secure, scalable software in a SCIF environment.