Staff Site Reliability Engineer at Zefr | Hybrid Hired

About the role

As a Staff Site Reliability Engineer at Zefr, you will apply cloud infrastructure expertise, collaborate on ML applications, and foster a DevOps culture while building scalable systems for responsible marketing in social environments.

Responsibilities

Support and build systems and tools that enable other engineers to generate, deploy, and manage product features and models both quickly and safely.
Deploy and support a multi-cloud microservice architecture, including infrastructure tailored for ML workloads, deployed via GitHub Actions, Argo CD, and Kubernetes.
Collaborate with other engineers, particularly the Machine Learning team, to architect secure, resilient, scalable, and cost-efficient applications and ML systems/pipelines in AWS and GCP.
Foster and push our DevOps culture and philosophy by encouraging continuous improvement across all engineering teams.
Proactively maintain the health of production environments, including monitoring application performance and resource utilization.
Participate in a 24/7 on-call rotation and respond to system performance issues and outages.
Debug code at the application and infrastructure level.
Mature our CI/CD workflows and release process.
Maintain a forward-thinking approach by actively researching and proposing new solutions.
Propose and review Engineering Requests for Comments (RFCs) to drive engineering architecture and practices.

Requirements

7+ years of experience designing, managing, deploying, and supporting cloud infrastructure in production environments on major public cloud providers (GCP experience a huge bonus)
Knowledge of GitOps, including an understanding of modern CI/CD pipelines, techniques, and technologies (GitHub Actions, GitLab, CircleCI, Argo CD, Flux)
Proficiency with IaC and configuration management tools (Terraform, Terragrunt, OpenTofu, Crossplane, Pulumi)
Production experience architecting, managing, deploying, and supporting container-based workloads on Kubernetes clusters
Strong problem-solving experience, focusing on automation
Proven track record of building and scaling reliability practices, including SLO/SLI frameworks, incident management, and capacity planning.
Extensive production experience with observability platforms and practices (Prometheus, Grafana, Chronosphere, Datadog, OpenTelemetry), including the ability to design monitoring strategies for complex distributed systems.
Knowledge of cloud networking (service mesh, NAT, load balancers, API gateways, proxies, etc.), cloud security, and cost optimization strategies.
Strong written and verbal communication, organization, and documentation skills

Benefits

Flexible PTO
Medical, dental, and vision insurance with FSA options
Company-paid life insurance
Paid parental leave
401(k) with company match
Professional development opportunities
10+ paid holidays
Summer Fridays (we leave early)
In-office, hybrid, and fully-remote work options available
In-office lunches and lots of free food
Optional in-person and virtual events (we like to celebrate!)
