Principal DevOps Engineer at KingMakers focusing on coding and infrastructure within product squads. Leading technical improvements in observability, reliability, and performance across platforms.
Responsibilities
Provide technical leadership across platform and product engineering, setting the direction for observability, reliability, security, and platform architecture.
Contribute hands-on to critical product and platform codebases (primarily .NET and React), focusing on high-leverage improvements rather than feature throughput.
Define and drive the organisation-wide observability strategy, including standards and best practices for OpenTelemetry (OTEL), telemetry pipelines, and signal quality.
Partner with senior engineers and engineering leadership to design scalable, resilient infrastructure using Terraform, AKS, and Azure.
Establish and champion security-by-default practices across code, infrastructure, and CI/CD — influencing architecture, not just implementation.
Lead the definition of SLIs, SLOs, and alerting strategies, ensuring reliability is measurable, actionable, and aligned with business outcomes.
Set standards for monitoring and dashboards in Grafana, ensuring clarity, signal quality, and consistency across teams.
Design and build shared tooling, platforms, and libraries that materially improve developer productivity and reduce cognitive load.
Act as a trusted technical advisor and bridge between product engineering, platform engineering, and leadership — shaping decisions through influence rather than authority.
Mentor senior engineers and help raise the overall engineering bar through design reviews, architectural guidance, and pragmatic coaching.
Requirements
Deep, production-grade programming experience in at least one major language — C#/.NET strongly preferred; Go, Python, or similar also valued.
A track record of designing, reviewing, and evolving complex production systems at scale.
Extensive experience with cloud platforms (Azure preferred), infrastructure-as-code (Terraform), and Kubernetes-based platforms (AKS).
Strong understanding of CI/CD systems and deployment strategies, ideally using Azure DevOps.
Excellent grasp of networking fundamentals (TCP, HTTP, headers, tracing, and real-world debugging).
Proven experience designing secure systems, including identity, access management, secrets handling, and secure pipelines.
Experience operating in medium-to-large engineering organisations where reliability, scale, and developer experience matter.
The ability to operate autonomously, identify high-impact problems, and drive solutions end-to-end.
Clear communication skills — able to explain complex technical trade-offs to engineers and non-engineers alike.
Nice to have
Deep hands-on experience with observability platforms (Prometheus, Grafana, Elastic, OpenTelemetry).
Experience designing or operating large-scale telemetry, monitoring, or logging systems.
Strong database knowledge (SQL Server, NoSQL, performance and reliability considerations).
Experience with Nginx, ingress controllers, or load balancing technologies.
A security-first mindset balanced with pragmatism, performance, and developer velocity.
Experience working across distributed or remote teams and influencing without direct authority.
Benefits
Health Support: A monthly allowance to invest in your health and wellbeing
Comprehensive Insurance: Extra protection and peace of mind for you and your loved ones
Future Planning: Employer contributions towards your long-term financial security
Performance Bonus: Discretionary rewards that celebrate your impact
Convenience at Work: Free reserved parking at the office
Hybrid Work: A flexible arrangement with 2 days in the office and 3 days remote each week
DevOps Engineer building and maintaining authentication platforms in multi-cloud environments. Using technologies like Terraform, Ansible, and Python for automation and optimization.
Cloud Engineer developing Infrastructure-as-Code with Terraform and Azure DevOps. Managing Azure infrastructure and leading incident response within cross-functional teams.
DevSecOps Engineer at Skillfield working on secure CI/CD pipelines for mobile-first delivery. Collaborating with teams to embed security and automation in the delivery lifecycle.
Lead DevOps Engineer focused on AWS and Azure data platform solutions. Collaborating with teams to deliver scalable, secure, and highly available solutions.
DevOps Engineer working at GRÜN Software Group to automate and maintain stable infrastructures. Collaborating with teams to improve deployments and processes for better performance.
Linux System Administrator managing IT infrastructures for educational institutions and research. Collaborating on DevOps and HPC projects while ensuring system security and performance.
Azure SRE Engineer responsible for designing and maintaining secure, scalable Azure cloud infrastructure. Driving automation and operational excellence for leading organizations in technology transformation.
Senior Manager of Site Reliability Engineering overseeing Workday's Kubernetes-based platform. Leading teams while ensuring high availability and collaborating with federal agencies.
Site Reliability Engineer focusing on AWS cloud environments, SRE practices, and system reliability within GFT's team. Collaborating on cloud migrations and observability initiatives.