Site Reliability Engineer ensuring system availability and performance for ADEO’s tech operations. Collaborating with teams on SRE practices and implementing monitoring solutions.
Responsibilities
Drive data quality related to operations on our product repositories
Manage and evolve SLI/SLOs for the entire GTDP
Implement and manage the Error Budget Policy process for the GTDP
Implement and manage CUJs (Critical User Journeys) for the GTDP
Coordinate decommissioning of obsolete products, servers, and APIs
Anticipate and manage technical debt (OS versions, DBMS, etc.)
Coordinate implementation of patches and security updates for systems
Support teams on monitoring, observability, and infrastructure-as-code topics
Ensure platform logs are accessible and analyzable
Implement AI for Ops use cases (e.g., predictive analysis of incidents)
Requirements
Proven experience as an SRE or Ops Engineer, or in a similar IT operations or DevOps role within a technology environment
Bachelor's or Master's degree (Bac+3 to Bac+5) in Computer Science, Information Systems, or equivalent
Strong understanding of SRE concepts: SLI/SLO, Error Budget Policy, CUJ, Toil Management, etc.
Experience with monitoring solutions such as Prometheus, Grafana, or Datadog
Proficient with automation and CI/CD tools (Ansible, Terraform, etc.)
Able to apply, and challenge, architecture, security, and performance standards
Committed to service quality and system reliability
Enjoy working cross-functionally with multiple teams and stakeholders
Comfortable collaborating in an international environment, including working in technical English
Benefits
A stimulating environment that encourages initiative and an entrepreneurial mindset
Role-specific training to develop your skills
Career growth and internal mobility opportunities within an international group
Quarterly team bonuses and the opportunity to become a shareholder
Flexible remote work policy
Support for sustainable commuting: contributions toward purchasing bikes and e-scooters, plus a carpooling allowance