SaaS SRE lead ensuring service reliability and quality for Nokia's SaaS business. Leading technical initiatives and enhancing operational processes for cloud-native services.
Responsibilities
Recover, restore, and build self-recovery capability for cloud-native services and components (AWS, GCP and Azure)
Ensure service assurance for SaaS applications (use cases) deployed across all public cloud hyperscalers that CNS SaaS will use.
L1/L2 Site Reliability Engineering Operations (event & incident management, change management and execution, security and privacy compliance remediation and mitigation)
Provide technical and operational leadership over Agile DevOps practices including documentation, iteration, planning, scheduling, coordinating and executing
Help devise and execute strategies for accomplishing service assurance improvements using creative and cost-effective means and methods
Encourage and foster SRE contribution and input/participation in continuous service improvements – both technical and procedural
Collaborate with team members and peers/partner organizations to determine and define best practices that bring benefits to SRE Operations and the SaaS organization
Work with Product Managers and R&D teams of SaaS applications (use cases) to determine and support service-level agreements (SLAs), service-level indicators (SLIs) and service-level objectives (SLOs)
Requirements
12+ years of operations, support, SRE, DevOps or related experience
Strong communication skills, including the ability to create presentations and dashboards
Experience or familiarity with public cloud native services and components (AWS, GCP, Azure)
Experience or familiarity with DevOps technologies (examples: GitHub, Terraform/Terragrunt, etc.)
Experience or familiarity with Kubernetes and related technologies (Docker, Helm, Kubernetes API)
Experience or familiarity with the Datadog monitoring tool and ticketing systems such as SF.com, ServiceNow and Jira, including process and API integrations
Experience with documentation management using Confluence, SharePoint and MS Teams
Experience in auto-recovery DevOps for continuous service improvement (backup and restore strategy)
Experience in L2/L3 application support integration with BU and Product teams
Benefits
Flexible and hybrid working schemes
A minimum of 90 days of maternity and paternity leave, with the option to return to work within a year following the birth or adoption of a child (based on eligibility)
Life insurance for all employees to provide peace of mind and financial security
Well-being programs to support your mental and physical health
Opportunities to join and receive support from Nokia Employee Resource Groups (NERGs)
Employee Growth Solutions to support your personalized career & skills development
Diverse pool of Coaches & Mentors to whom you have easy access
A learning environment which promotes personal growth and professional development - for your role and beyond