Non-Production Management Technical Lead position in North America DevOps for GCG applications. Requires strong DevOps, production management, and leadership skills with a focus on consumer applications.
Responsibilities
Provides expertise related to various Distributed Consumer Applications across multiple Lines of Business in North America
Enable Production Management processes in non-production environments to provide environment stability
Execute robust service readiness
Facilitate standard toolset adoption for all services in the domain
Works as an L2 expert supporting Incident Management, Problem Management, Risk Management, Change Management, and the CI/CD enablement pipeline for the identified SRE function
Has overall accountability for non-production stability in the assigned area/domain
Partners with Level 3 support teams to improve resolution rates, efficiency targets, and organizational Service Level Agreements
Performs SRE analysis, remediates identified issues with stakeholders, and holds them accountable during release sign-offs
Partners with SRE enablement, and eventually works as an SRE, to identify key areas and provide SRE recommendations from UAT to PERF and PROD for the key business transactions supported
Identifies and leads the implementation of Service Automation to reduce cost, reduce risk, improve efficiency, and enable Service Management to keep up with the ever-increasing volume and fast pace of newer technologies
Continually evolves the working practices within, and the services provided by, Production Management to improve efficiency and productivity
Ability to conduct the blameless problem management/post-mortem phase of major incidents, develop executive briefings, assess major incident impacts, and drive service improvements to prevent repeat incidents
Create PMRs for P1/P2 incidents and close out the resulting actions
Identify and classify risks in the non-production estate, work with peers and team members to create Service Improvement Plans, and drive them to closure
Create operational readiness documents for major initiatives and provide a seamless handover to the production team
Work with the SRE team to create a proactive analysis of the UAT and PERF views before handing over to Production Management
Accountable for end-to-end service health of the NAM Core space
Has overall accountability for patching, changes, infrastructure changes, certificates, and other KTLO activities in the assigned domain
Has overall accountability for monitoring and its usage by stakeholders
Work with the monitoring team on monitoring setup and accountability
Represent the DevOps team in various digital forums and facilitate the generation of reports and presentations
Be proficient in technologies such as OSE, Apigee, AWS, and other new-age technologies
Adopt the automation laid down by Production Management automation and AIOps
Support and achieve successful internal audits
Requirements
8+ years development or production support experience with North America Consumer applications
Experience or familiarity with cloud technology is a plus
Solid ITIL Foundation understanding
Engineering background in system administration, development, DevOps, or an equivalent field, preferably with experience in distributed consumer applications
Experience or familiarity with automation technologies, advanced analytics, and predictive modelling
Ability to develop and manage relationships at all levels
Experience with databases, e.g. Oracle, DB2
Programming experience in at least one language such as Unix shell scripting or Java
Competent with cloud concepts, e.g. APIs, web services, and microservices