Senior Analyst for Cloud Ops focused on maintaining critical GCP environments and leading incident resolution. The role also supports infrastructure improvements and collaborates with multiple teams.
Responsibilities
Serve as a specialist in the support and evolution of critical GCP environments;
Lead the resolution of complex and critical incidents, including cross-service scenarios;
Define mitigation and incident recovery strategies;
Conduct root cause analysis (RCA) and postmortems for P1 and P2 incidents, ensuring that preventive and corrective action plans are in place;
Propose and implement improvements to architecture, capacity, performance, and reliability;
Develop automations and infrastructure-as-code (IaC) using Terraform to reduce repetitive toil;
Support cost optimization initiatives in partnership with FinOps, including periodic analyses and presentations;
Participate in the planning and execution of the annual DR (disaster recovery) exercise;
Support migration initiatives and the evolution of workloads to GCP;
Provide technical mentoring for Level 1 and Level 2 (N1/N2) support teams, promoting standardization of runbooks, evidence collection, and operational best practices.
Requirements
Bachelor's degree (completed)
Advanced experience in operation, support, or reliability of production and critical GCP environments
Advanced troubleshooting skills, including log and metric analysis
Strong knowledge of networking and managed GCP services (Compute Engine, Cloud Storage, Cloud SQL, BigQuery, Cloud Run/Cloud Functions, Pub/Sub)
Experience with automation, IaC (Terraform), and continuous improvement practices
Ability to act as a technical reference, coordinating with multiple teams and vendors
Technical English (preferred)
Google Cloud certifications – Professional Cloud Architect or Professional Cloud DevOps Engineer (preferred)
Experience with Kubernetes/GKE (preferred)
Knowledge of Service Mesh (preferred)
Experience with API management (Apigee) (preferred)
Experience with FinOps, cloud cost control and governance (preferred)