Staff Site Reliability Engineer responsible for cloud infrastructure implementation and reliability improvements at Auror. Collaborating with engineering teams to enhance production code understanding.
Responsibilities
Implementing changes to cloud infrastructure (across Azure and Google Cloud Platform)
Understanding and improving reliability, cost and scale
Partnering with engineering streams to ensure they understand how their code is running in production
Setting technical standards, driving best practices across teams and influencing architectural decisions
Managing incidents with Rootly
Analysing observability signals with Honeycomb and Sumo Logic
Managing container-based workloads (using Cloud Run, Azure Container Apps, AKS and GKE)
Managing routing, security and Workers in Cloudflare
Managing infrastructure with Terraform
Driving observability strategy, including OpenTelemetry adoption and standardisation across .NET services and cloud infrastructure
Designing, implementing and maintaining cloud infrastructure and services (across Google Cloud Platform and Azure) using Terraform
Building and improving CI/CD pipelines using GitHub Actions and Octopus Deploy
Mentoring and coaching other engineers
Participating in incident management, including post-mortems
Influencing the product roadmap and working with engineering and product counterparts to improve resiliency, reliability and scalability
Proactively identifying scalability and reliability issues within systems
Participating in a shared on-call rotation (1 out of every 4 weeks)
Requirements
Infrastructure as Code (Terraform) - You bring strong, hands-on Terraform experience in production environments. You’re comfortable designing and maintaining reusable modules, managing state safely, troubleshooting complex issues, and evolving infrastructure using best practices. You can contribute immediately without needing to ramp up on alternative IaC tools.
.NET Proficiency - You have solid working experience with .NET systems and a good understanding of tracing and observability concepts. You’re confident navigating existing codebases, making thoughtful changes, and testing them appropriately. While you don’t need to be a deep specialist, you can reason clearly about application behaviour and system impact.
Multi-Cloud Experience (GCP Preferred) - You’ve worked in multi-cloud environments, ideally with strong exposure to GCP. We use both Azure and GCP, so a solid understanding of Azure is also expected. AWS experience is welcome where you can demonstrate transferable cloud principles. We value adaptability across platforms over experience limited to a single cloud provider.
Incident Leadership & Operational Experience - You’ve played an active role in incident response and have led large-scale or complex incidents as an Incident Commander (or equivalent). You understand the operational and communication challenges involved and can lead calmly, clearly, and effectively under pressure.
High Agency & Ownership - You take initiative and proactively identify problems before they escalate. You don’t wait for direction — you step into ambiguity, take accountability, and drive outcomes with urgency and care.
Emotional Intelligence & Low Ego - You communicate clearly and respectfully, value diverse perspectives, and contribute positively to team culture. You’re open to feedback, collaborate well across functions, and prioritise collective success over individual recognition.
Curiosity, Ambition & Empathy - You’re motivated to improve systems and outcomes, asking thoughtful questions and seeking deeper understanding. You balance ambition with empathy, ensuring high performance supports — rather than undermines — team cohesion and psychological safety.
Benefits
Competitive salary range: $145k - $185k NZD per year (IC4), depending on level of experience.
Employee share scheme: You’ll own part of a company making a real difference!
Flexibility: We are hard-working and outcome focused, but recognise there is more to life than work. We promote a healthy work/life blend.
Shorter work weeks (at full pay): Everyone gets Friday afternoons off, so you can start your weekend early, and do more of whatever it is that makes you happy.
Health Care Plan: In partnership with Nib, Auror covers 100% of the cost of your individual health insurance plan.
Focus on mental and physical health: We understand how vital our health is and have policies to support your wellness, including Wellness Days and up to three paid expert sessions each year.
Family-friendly: We offer comprehensive parental leave and benefits for primary and non-primary caregivers, including a baby bonus and meals delivered to your door.
Personal growth: We support our team to participate in courses, conferences, or events that will help them develop their skills.
Team love: We have regular team lunches and social events where most (if not all) activities are during work hours.