Lead SRE ensuring system reliability and scalability at Veepee, a leading e-commerce company. Collaborating with teams to automate processes and support technical challenges.
Responsibilities
Implementing tools and processes for deployment and industrialization (CI/CD, blue/green, canary, rollback, etc.);
Automating provisioning of a resilient infrastructure that meets the needs of products;
Working with development teams to facilitate regular releases;
Maintaining services in operational condition, and analyzing and resolving performance and scalability anomalies (load tests) in current and historical deployments;
Supervising the application portfolio in collaboration with the Network Operations Center (NOC), and managing access and security;
Participating in the evolution of the information system (migration from VMware to KVM and service offering) and in the reduction of technical debt;
Evangelizing DevOps best practices and participating in building a true cross-functional SRE community within Veepee;
Sharing company information and communicating the team's activity;
Defining and running a clear and relevant organization within the team;
Developing the team without micromanaging.
Requirements
At least 3 years of experience in a similar role;
Knowledge of industrialization processes, agile methods, the Gitflow workflow, and DevOps practices in general, with an understanding of the systems side;
Experience in maintaining high levels of availability;
On-call organization and incident response;
Good knowledge of Linux; knowledge of Windows would be a plus;
Proficiency with IaC: Packer, Terraform, Ansible, Puppet;
Monitoring (supervision): Icinga, ELK, Prometheus;
Hands-on experience with Docker, Kubernetes, Nomad, Consul;
Proficiency with different types of databases, such as PostgreSQL, MongoDB, and Elasticsearch;
Strong verbal and written English language skills;
Empathetic and open-minded.
Benefits
Dynamic and creative environment within international teams
A variety of self-study courses on our e-learning platform
Participation in meetups and conferences, locally and internationally