Manager of Cloud Operations leading SRE practices to ensure reliability and scalability of cloud infrastructure on AWS and Azure. Join a growing team at Vendavo, enhancing customer success through efficient cloud operations.
Responsibilities
Lead, mentor, and develop a team of DevOps and SRE engineers.
Implement and promote SRE principles and practices across the organization.
Define and monitor service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs).
Develop and implement incident response and post-mortem processes.
Drive automation of operational tasks and infrastructure management.
Design, implement, and maintain scalable and resilient infrastructure on Azure and/or AWS.
Implement infrastructure-as-code (IaC) using tools like Terraform.
Ensure security and compliance of cloud environments.
Manage CI/CD pipelines for automated deployments.
Implement and maintain comprehensive monitoring and alerting systems using tools such as Azure Monitor, AWS CloudWatch, Prometheus, and Grafana.
Communicate effectively with stakeholders at all levels.
Hire the right team for the product.
Requirements
12 to 18 years of experience, including leading SRE teams that build highly scalable, secure, efficient, and resilient production systems in AWS and/or Azure.
Proven experience in implementing and managing SRE practices.
Strong understanding of CI/CD pipelines and automation tools.
Proficiency in infrastructure-as-code (IaC) tools (Terraform).
Experience with containerization and orchestration technologies (Docker, Kubernetes).
Strong understanding of networking concepts and protocols.
Experience with monitoring and logging tools (Azure Monitor, CloudWatch, Prometheus, Grafana, ELK stack).
Scripting and programming skills (Python, Bash, etc.).
Experience with various databases (Oracle, SQL Server, etc.).
Benefits
Professional growth and development opportunities.
Working within a team of friendly, skilled people where help is always within reach
Flexible working hours
4 recharge days per year: one day each quarter, the entire company pauses in all geographies. Spend the day in whatever way helps you recharge, regain energy, and dive back into the next workday.
High-end laptop (Dell or Mac)
Competitive pay and bonus
18 vacation days per year, in addition to 15 days of sick/casual leave per calendar year.
16 hours of paid volunteer time off per year
Wedding gift and newborn gift allowance for employees.
26 weeks of paid maternity leave and one week of paid paternity leave.
12 wellness leaves for women employees
Health insurance of up to 7 lakh covering the employee, spouse, 4 dependent children, and parents. Vendavo pays 100% of the premium.
Group term insurance coverage of up to three times annual CTC. Dependents are not covered.
Group personal accident coverage of up to three times annual CTC. Dependents are not covered.