Understand the business drivers and analytical use-cases.
Address area-level risks; provide and implement mitigation plans.
Report on area readiness and quality, and raise red flags in crisis situations.
Monitor the production environment with a complete view of system health; track and produce metrics for the team, and develop strategies to increase efficiency.
Maintain software and systems to run the applications.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Measure and optimize system performance, with an eye toward pushing our capabilities forward, anticipating customer needs, and innovating to continually improve.
Provide primary operational support for multiple large, distributed software applications.
Work with business clients and internal and external teams to debug or tackle application issues.
Be flexible in providing on-call support to resolve issues as the need arises.
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
Partner with development teams to improve services through thorough testing and release procedures.
Participate in system design consulting, platform management, and planning.
Create sustainable systems and services through automation and uplifts.
Balance feature development speed and reliability with well-defined service level objectives.
Requirements
Bachelor’s degree or equivalent experience in computer science or other technical field.
Minimum of 3 to 5+ years of relevant technical experience.
Problem-solving ability to identify and resolve root causes.
Ability to program (structured and OO) with one or more high-level languages or frameworks, such as JavaScript, Node.js, React, or similar.
Experience with one or more of the following: scheduling (CA, CA WLA), SQL queries and scripting, Excel, Informatica development (or related ETL tools), UNIX shell scripting, PowerShell, Windows batch scripting.
Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
Experience with Agile (Scrum or Kanban), Jira and ServiceNow.
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
Benefits
Manulife offers eligible employees a wide array of customizable benefits, including health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, and employee/family assistance plans.
Various retirement savings plans (including pension and a global share ownership plan with employer matching contributions) and financial education and counseling resources.
Opportunity to participate in incentive programs and earn incentive compensation tied to business and individual performance.
Generous paid time off program in Canada includes holidays, vacation, personal, and sick days, and the full range of statutory leaves of absence.
Flexible environment where well-being and inclusion are emphasized.
Support for learning and career growth.
Global team events and recognition awards.