SRE Team Lead in charge of reliability strategy and operational maturity for a cybersecurity SaaS platform. Leading a specialized team to enhance system performance and incident management.
Responsibilities
Leading and mentoring an experienced SRE team while defining and driving the reliability strategy of a production-grade cybersecurity SaaS platform.
Designing and evolving multi-cluster Kubernetes environments across cloud providers while owning availability, performance, and incident management processes.
Establishing and enforcing SLOs/SLAs, error budgets, and production standards.
Driving infrastructure as code and automation standards (Terraform, CI/CD) while improving observability, monitoring, and operational visibility across the system.
Performing and leading root cause analysis for complex production incidents.
Partnering with R&D, Security, and Product to align reliability with rapid delivery.
Shaping architectural decisions at the platform level.
Requirements
Have 2+ years leading or managing infrastructure/SRE teams.
Have solid hands-on Kubernetes production experience.
Have experience operating cloud environments (GCP, AWS, Azure, or similar) with a good understanding of reliability engineering principles (SLOs, SLAs, error budgets).
Have experience with infrastructure as code and automation (Terraform, Ansible).
Have software engineering experience with exceptional Linux and networking troubleshooting skills.
Have proven experience handling production incidents and conducting root cause analysis, and the ability to drive technical standards across teams.
Demonstrate clear and structured communication skills.
Junior DevOps Engineer implementing continuous integration and deployment architecture for the Defense Logistics Agency. Debugging cluster-based computing while using various configuration management tools.
Mobile DevOps Engineer developing hybrid applications with Ionic for a global organization. Collaborating across teams to optimize development practices and maintain the mobile environment.
Site Reliability Engineer improving system reliability and performance in production environments with a focus on automation and operational efficiency. Collaborating with engineering and infrastructure teams on deliverable-focused projects.
Lead Virtualisation Engineer at Mastercard focused on service quality and performance of platform virtualisation technologies. Collaborate with teams to ensure availability, scalability, and resilience across the network in Singapore.
Senior DevOps Engineer in a technology consulting firm connecting tech talents to impactful projects. Involves working in healthy environments with growth opportunities.
DevOps Engineer at LRQA, optimizing deployments and driving process improvements in a global assurance provider. Focusing on CI/CD pipelines, security best practices, and team collaboration.
DevOps Engineer at Booz Allen enhancing critical systems for space operations. Modernizing architectures and collaborating with teams to solve complex challenges.
DevOps Developer managing cloud infrastructure and CI/CD pipelines for Volkswagen Group Services. Collaborating with teams to ensure stable and efficient software deployments in a hybrid work environment.
Mid-Level DevOps Analyst (Analista DevOps Pleno) at Finnet managing cloud and infrastructure projects for client solutions. Involves architecture design, systems management, and team collaboration.