Senior Site Reliability Engineer at Salla leading reliability initiatives and ensuring platform performance. Handling incidents and mentoring engineers in building resilient systems.
Responsibilities
Lead reliability initiatives, handle complex incidents, improve platform performance, and guide engineering teams toward building resilient systems.
Participate in the **on-call rotation** as part of our commitment to platform reliability.
Troubleshoot complex issues across applications, infrastructure, and networks.
Identify and resolve performance bottlenecks and scaling challenges.
Enhance cloud-native infrastructure, deployment processes, and automation.
Build and refine dashboards, alerts, metrics, logs, and traces.
Develop tools that reduce operational toil and increase reliability.
Mentor engineers on reliability, debugging, and operational best practices.
Requirements
Strong experience with **Kubernetes**, **service mesh technologies**, and cloud platforms (AWS/GCP/Azure).
Deep understanding of **Linux**, networking, distributed systems, and load balancers.
Hands-on experience with **Terraform** or similar IaC tools.
Experience with **Prometheus**, **Grafana**, **Loki**, **Mimir**, **Elastic**, or similar observability tools.
Proficiency in scripting/programming (Bash, Python, Go).
Experience with CI/CD and GitOps.
Strong debugging, incident response, and performance analysis skills.
DevSecOps Engineer supporting AI-enabled financial compliance initiative for the Department of War. Responsible for designing secure infrastructure and collaborating with cross-disciplinary teams.
(Senior) DevOps Engineer with a focus on CI/CD and cloud infrastructure management for e-commerce solutions. Collaborating across teams to ensure automated, scalable deployments.
Senior DevOps Engineer managing monitoring systems for B2B e-commerce platforms in Azure Cloud. Collaborating with teams to improve platform products and processes.
DevSecOps Expert managing deployments, monitoring systems, and providing technical support in Brussels. Role involves close collaboration with Development and IT teams at a major client's site.
DevOps Engineer at Gemba designing secure, cloud-native platforms for public-sector organizations. Leading technical decisions and collaborating to solve complex challenges for critical systems.
DevOps Engineer automating cloud-native infrastructure for public-sector organizations. Join an agile team to enhance deployment processes and support critical systems.
DevOps Engineer designing and constructing secure cloud-native platforms for public-sector organizations across the UK. Leading technical decisions while collaborating closely with clients.
Junior and DevOps Engineers designing and running secure cloud-native platforms for UK public-sector organisations. Collaborating with teams to streamline deployment and automate infrastructure workflows.
Site Reliability Engineer optimizing global trading infrastructure for a crypto capital markets partner. Responsibilities include cloud environment management and system design for high availability.
DevOps Engineer responsible for implementing and operating CI/CD pipelines for SaaS services. Collaborating with teams to ensure reliable and secure operations in the Risk & Fraud business unit.