Site Reliability Engineer at Trainline | Hybrid Hired

About the role

Site Reliability Engineer contributing to platform reliability at Trainline, Europe's leading rail ticketing platform. Collaborating with product engineering on operational readiness and incident response.

Responsibilities

Developing an understanding of system architecture, dependencies, and failure modes across the Trainline platform
Participating in production incident response, supporting investigation, mitigation, communication, and coordinated service restoration
Contributing to post-incident reviews and follow-up actions to improve reliability, scalability, and resilience
Taking part in the SRE on-call rotation
Designing, building, and maintaining observability using metrics, logs, events, and traces to support effective detection and diagnosis
Improving monitoring and alerting by aligning signals to business and customer impact, reducing noise and improving mean time to detection (MTTD)
Ensuring relevant operational data is surfaced quickly and clearly during live incidents
Making informed tooling and technology choices using SRE principles, balancing team and business needs
Supporting AWS-hosted infrastructure and shared platform services using infrastructure-as-code and CI/CD tooling
Collaborating with product engineering teams to ensure services are operationally ready and deployed safely
Advising on reliability and resilience practices
Writing and maintaining reliable, well-structured code and scripts to support reliability and observability goals
Prioritising work effectively and collaborating using agile processes to deliver against team and business goals

Requirements

Experience of SRE concepts such as SLI, SLO and error budgets.
Hands-on experience with observability tooling such as New Relic, Elastic (ELK Stack), Influx, Grafana or similar.
Experience working with cloud providers (preferably AWS).
Experience troubleshooting Linux operating systems.
Experience of scripting in at least one language (preferably Python).
Understanding of load balancing and reverse proxy concepts, including upstream configuration, upstream health checks, and worker and data flow.
Understanding of application architecture concepts (threading, queuing, readiness checks, health checks, circuit breakers, timeouts, exponential backoff, throttling).
Experience building, maintaining and evolving time series data, with an understanding of retention, cardinality, deviation, moving averages and other functions.
Experience with build, deployment & configuration management tooling such as GitHub Actions and Terraform.

Benefits

Private healthcare & dental insurance
Generous work from abroad policy
2-for-1 share purchase plans
EV scheme to reduce carbon emissions
Extra festive time off
Excellent family-friendly benefits
Clear career paths
Transparent pay bands
Personal learning budgets
Regular learning days
