SaaS SRE lead ensuring service reliability and quality for Nokia's SaaS business. Leading technical initiatives and enhancing operational processes for cloud-native services.
Responsibilities
Recover, restore, and build self-recovery capability for cloud-native services and components (AWS, GCP and Azure)
Ensure service assurance for SaaS applications (use cases) deployed across every public cloud hyperscaler that CNS SaaS will use
L1/L2 Site Reliability Engineering Operations (event & incident management, change management and execution, security and privacy compliance remediation and mitigation)
Provide technical and operational leadership over Agile DevOps practices including documentation, iteration, planning, scheduling, coordinating and executing
Help devise and execute strategies for accomplishing service assurance improvements using creative and cost-effective means and methods
Encourage SRE contribution to, and participation in, continuous service improvements, both technical and procedural
Collaborate with team members and peers/partner organizations to determine and define best practices that bring benefits to SRE Operations and the SaaS organization
Work with Product Managers and R&D teams of SaaS applications (use cases) to determine and support service-level agreements (SLAs), service-level indicators (SLIs) and service-level objectives (SLOs)
Requirements
12+ years of operations, support, SRE, DevOps or related experience
Strong communication skills, including the ability to create presentations or dashboards
Experience or familiarity with public cloud native services and components (AWS, GCP, Azure)
Experience or familiarity with DevOps technologies (examples: GitHub, Terraform/Terragrunt, etc.)
Experience or familiarity with Kubernetes and related technologies (Docker, Helm, Kubernetes API)
Experience or familiarity with the Datadog monitoring tool and ticketing systems such as SF.com, ServiceNow and Jira, including process and API integrations
Experience with documentation management using Confluence, SharePoint and MS Teams
Experience in auto-recovery DevOps for continuous service improvement (backup and restore strategy)
Experience in L2/L3 application support integration with BU and Product teams
Benefits
Flexible and hybrid working schemes
A minimum of 90 days of maternity and paternity leave, with the option to return to work within a year following the birth or adoption of a child (based on eligibility)
Life insurance for all employees to provide peace of mind and financial security
Well-being programs to support your mental and physical health
Opportunities to join and receive support from Nokia Employee Resource Groups (NERGs)
Employee Growth Solutions to support your personalized career & skills development
Diverse pool of Coaches & Mentors to whom you have easy access
A learning environment that promotes personal growth and professional development, for your role and beyond