Founding leader of Platform Engineering at Rootly, shaping reliable incident management infrastructure. Building and leading teams to ensure high performance and operational maturity in a fast-growing environment.
Responsibilities
Own the vision, strategy, and roadmap for Rootly’s infrastructure and developer platform
Build and lead a high-performing Platform Engineering organization that may include SRE, infrastructure, DevEx, and internal tooling
Establish a culture where reliability, performance, and developer experience are non-negotiable
Act like an owner, spotting problems early, mobilizing teams, and driving solutions from concept to completion
Architect a highly available, redundant, and scalable infrastructure foundation
Lead capacity planning, cost management, performance tuning, and long-term infrastructure scaling
Drive operational maturity through infrastructure as code, declarative infrastructure, configuration management, and repeatable automation
Enable product engineers to move extremely quickly by optimizing local dev environments, ephemeral cloud environments, fast CI/CD, and reliable canaries
Provide tooling that abstracts infrastructure complexity and removes friction from development
Ensure every engineer can ship confidently, frequently, and safely
Own platform-wide SLOs, SLIs, and error budgets, and use them to drive prioritization
Oversee observability tooling, monitoring, alerting, and incident response processes
Partner with product engineering teams to ensure services meet reliability and performance goals and to improve runbooks and postmortems
Drive high-quality execution with urgency while balancing long-term bets with tactical wins
Raise the bar and inspire engineers to think bigger, move faster, and deliver exceptional results
Collaborate closely with Product, Engineering, and leadership to align platform investments with company strategy
Recruit, mentor, and develop top-tier platform engineers and create a culture of excellence
Requirements
10+ years in platform, infrastructure, SRE, or DevOps roles, with increasing leadership responsibility
Experience leading platform or SRE teams, including hiring, mentoring, and building culture
Deep expertise with cloud infrastructure (AWS preferred), distributed systems, scaling, and redundancy
Proven experience designing or operating high-scale production systems and delivering operational maturity
Strong background in observability, performance tuning, and scaling strategies
Comfortable writing production-grade software to solve infrastructure problems; Ruby or Go is a plus
Strong architectural judgement and systems thinking that anticipates scaling pain before it becomes real
Benefits
Competitive compensation and early equity in a fast-growing, venture-backed company.
Comprehensive medical, dental, and vision coverage.
3 weeks of vacation, plus unlimited sick and mental health days, and a company-wide end-of-year shutdown to recharge.
$500 stipend for home office setup.
A fast-moving, high-impact environment where your leadership and ideas directly shape the future of the company.
Senior Platform Engineer at Rootly building infrastructure for incident management and enhancing system reliability. Collaborate with product teams to drive performance and scalability of services in a high-impact environment.
Cloud Platform Engineer developing software solutions for federal government customers. Working with Cloud technology (AWS), Go, and Linux in a hybrid environment.
Platform Engineer specialized in IT Infrastructure within Private Cloud, responsible for maintaining infrastructure and collaborating with teams to ensure reliability and security.
Senior Platform Engineer managing cloud infrastructure and application code delivery for a financial services company. Leading automation and security through CI/CD pipelines and IaC.
Container Platform Engineer managing and evolving container platforms for a leading logistics software company. Collaborating closely with management to ensure stable and secure cloud and container services.
Senior Container Platform Engineer at LexisNexis leading design, build, and optimization of cloud-native application platforms with AWS and Kubernetes. Collaborating with DevOps, SRE, and Security teams for scalable, secure solutions.
Product Reliability Engineer ensuring availability and scalability for Kraken's energy management platform. Collaborate with diverse teams to enhance system performance and reliability.
Platform Engineer working on enterprise data and analytics products for the DoD. Delivering scalable, production-ready solutions with a focus on mission impact.
Platform Engineer at Avanade working on Azure platform foundations, infrastructure changes, and CI/CD pipelines. Collaborating with senior engineers and architects to deliver impactful solutions.
Dialer Platform Engineer at OneMain Financial configuring outbound campaigns across systems. Collaborating with Strategy, Compliance, and IT to ensure effective dialing performance.