Site Reliability Engineer at WRITER

About the role

As a site reliability engineer at WRITER, you will ensure 24/7 availability of AI-powered workflows by developing and automating robust platforms that handle high-traffic AI workloads.

Responsibilities

Automate operational tasks and infrastructure management by developing robust tools and platforms using Python, Go, or similar languages, significantly reducing manual toil across our production environment
Design and implement scalable, fault-tolerant infrastructure solutions on public cloud providers (AWS, GCP, Azure) to support WRITER's rapidly expanding, high-traffic AI platform
Own the reliability, performance, and efficiency of WRITER’s core services, defining and upholding stringent Service Level Objectives (SLOs) and error budgets
Own the observability stack (monitoring, logging, and alerting) to ensure rapid detection of issues across our complex distributed systems
Lead incident response, post-mortems, and root cause analyses, applying learnings to proactively prevent future outages and build a more resilient system architecture
Collaborate closely with product and engineering teams, providing expert guidance on system design for reliability, performance, and scalability from conception through launch

Requirements

7+ years of experience in site reliability engineering, DevOps, or a similar role focused on building and operating large-scale, high-availability production systems
Deep expertise with cloud platforms (AWS strongly preferred), containerization technologies like Docker and Kubernetes, and Infrastructure-as-Code tools such as Terraform
Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring
Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) to maintain system health and performance
Demonstrated ability to Challenge the status quo, proactively identify systemic weaknesses, and propose innovative solutions to complex reliability problems
Excellent communication, collaboration, and problem-solving skills, with a talent for building strong relationships and Connecting with cross-functional teams
A strong sense of ownership and accountability, eager to Own mission-critical systems and drive them toward peak performance and unparalleled reliability

Benefits

Generous PTO, plus company holidays
Comprehensive medical and dental insurance
Paid parental leave for all parents (12 weeks)
Fertility and family planning support
Early-detection cancer testing through Galleri
Competitive pension scheme and company contribution
Annual work-life stipends for wellness (gym, massage/chiropractor, personal training, etc.) and for learning and development
Company-wide off-sites and team off-sites
Competitive compensation and company stock options
