Lead and grow a team of SREs to build automated, resilient infrastructure and reliable deployment pipelines.
Collaborate with development teams, oversee incident response, and drive DevOps best practices across the organization.
Responsibilities
Manage and lead a team of 6 passionate SREs
Implement tools and processes for deployment and industrialization (CI/CD, blue/green, canary, rollback, etc.)
Automate provisioning of a resilient infrastructure that meets product needs
Work with development teams to facilitate regular releases
Maintain services in operational condition; analyze and resolve performance and scalability issues (including load testing) for current and historical deployments
Oversee the application portfolio in collaboration with the Network Operations Center (NOC); manage access and security
Contribute to the evolution of the IT infrastructure (e.g., VMware to KVM migration and service offering) and reduce technical debt
Act as a DevOps advocate and help build a cross-functional SRE community across the company
Share company information and communicate team activities
Define and maintain a clear, relevant team organization
Develop the team while avoiding micromanagement
Requirements
Minimum 3 years’ experience in a similar role
Proven managerial experience
Knowledge of industrialization processes, agile methodologies, GitFlow and DevOps best practices, with a solid understanding of system administration
Experience maintaining high-availability systems
Experience with on-call organization and incident response
Strong Linux skills; Windows knowledge is a plus
Proficiency with Infrastructure-as-Code: Terraform, Ansible
Experience with logging and monitoring: ELK (Elasticsearch, Logstash, Kibana), Prometheus
Hands-on experience with Docker, Kubernetes, Consul, Vault
Experience with messaging systems such as RabbitMQ
Experience with databases such as PostgreSQL, MongoDB, Elasticsearch
Good knowledge of backup and recovery systems
Strong verbal and written English skills
Empathetic and open-minded
Benefits
Dynamic and creative environment within international teams
Wide range of self-learning courses available on our e-learning platform
Opportunities to participate in local and international meetups and conferences