Site Reliability Engineer on FNBO’s Technical Operations team managing ITSM practices and collaborating with developers. Leading incident response, monitoring, automation, and performance optimization efforts.
Responsibilities
Lead ITSM practices, including change, incident, problem, and knowledge management.
Lead major incident response calls and help resolve issues.
Lead the Change Management process, ensuring change preparedness.
Lead the Problem Management process and perform problem review analysis.
Assist in monitoring FNBO systems/applications and follow best practices for proactive monitoring and resolution.
Participate in the on-call rotation to respond to incidents in a timely manner.
Support performance optimization efforts to ensure system peak performance and customer satisfaction.
Assist in conducting Post-Mortems after major incidents to help identify root causes and prevent future issues.
Collaborate with Application Development teams on deployment pipeline maintenance.
Contribute to the growth of the Site Reliability Engineering practice.
Perform Operational Readiness reviews and validate readiness for implementation.
Provide Knowledge Management leadership.
Create monthly executive reports on identified team KPIs.
Requirements
Bachelor’s degree in a related field (or 3-5 years of related experience).
Knowledge of ITIL practices and relevant ITIL certifications (e.g., ITIL Foundation) preferred.
Familiarity with a service ticketing tool such as ServiceNow.
Familiarity with monitoring tools such as Dynatrace (preferred) or other similar platforms.
Understanding of basic development practices, scripting, automation, and monitoring.
Ability to automate tasks using scripting tools.
Knowledge of agile practices is a plus.
Ability to work effectively in a team environment and engage in team activities.
Candidates must possess unrestricted work authorization and not require future sponsorship.
DevOps Lead at Leidos managing platform engineering, SRE, and application security functions. Driving operational excellence and ensuring scalability for federal government applications.
SRE Lead developing scalable cloud-native solutions for mission-critical systems supporting USAF. Managing teams, collaborating with cross-functional units, and ensuring high service reliability standards.
Junior DevOps / Platform Engineer at DieEnergiekoppler GmbH managing AWS/EKS platform operations. Collaborating with team members to improve platform functionalities and security compliance.
DevOps Engineer responsible for AWS infrastructures and backend development at Allguth GmbH. Engaging in greenfield projects with modern solutions in a collaborative team.
Cloud DevOps Specialist responsible for building scalable infrastructure solutions in AWS at SONDA. Focusing on automation, containerization, and data management in a collaborative environment.
DevOps Engineer maintaining and evolving deployment pipelines for Docebo’s AI-powered learning platform. Collaborating with cross-functional teams to ensure efficient software releases and infrastructure management.
DevOps Engineer optimizing CI/CD pipelines for Docebo, an AI-powered learning platform. Involves managing multi-tenant infrastructure using AWS, Docker, and Kubernetes.
DevOps Engineer maintaining and automating infrastructure and CI/CD processes for cybersecurity solutions by NordLayer. Collaborating with teams to ensure performance and scalability of cloud services.
DevOps Engineer maintaining and improving infrastructure and CI/CD processes for a cybersecurity solutions provider. Collaborating with cross-functional teams for reliable and scalable cloud solutions.
DevOps Engineer maintaining and automating infrastructure and CI/CD processes at NordLayer. Collaborating with Senior Engineers to implement best practices in a dynamic cybersecurity environment.