Hybrid Platform Engineer – Product Reliability, Mid/Senior Level

Posted 23 hours ago

About the role

  • As a Product Reliability Engineer, you will ensure the availability and scalability of Kraken's energy management platform, collaborating with diverse teams to improve system performance and reliability.

Responsibilities

  • Teach and support product teams on best practices for reliability, implementation patterns and effective usage of our existing platforms
  • Support product teams in improving the performance and availability of their systems
  • Be hands-on in code and infrastructure to help product teams with reliability improvements
  • Provide comprehensive feedback to the wider Platform group on improvements to be made to core infrastructure based on observations and first-hand experience in the code base
  • Support product teams in building proofs of concept as needed, evolving the application deployment architecture to keep pace with business growth and to improve scalability and system resilience
  • Collaborate with product teams to support the release of new features and services, ensuring adherence to reliability and performance standards
  • Guide product teams in designing systems for resilience and graceful failure under heavy load
  • Assist application teams with post-incident tasks and follow-ups, and contribute to the creation and review of post-mortem documentation
  • Analyse incident metrics to identify trends and potential improvements, communicating these insights to the product teams
  • Help solve interesting and difficult problems. There’s a great opportunity for disruption in the global energy market

Requirements

  • Great communication skills, working effectively with developers, product managers and other business stakeholders to understand, design and deliver impactful projects and reliability improvements
  • Solid hands-on experience across our core platform stack:
      • **AWS** (supporting and improving cloud infrastructure used by product teams)
      • **Terraform** (infrastructure as code; comfortable operating with Terraform day-to-day)
      • **Kubernetes** (container orchestration and deployment management; comfortable working with Kubernetes day-to-day)
  • Experience using industry-standard observability tooling - we use Datadog, Grafana, Prometheus and Rootly (experience with other monitoring/alerting platforms is transferable)
  • Strong collaboration and communication skills - able to work effectively with developers, product managers, and other stakeholders to design and deliver impactful observability “golden paths” and monitoring experiences
  • Exposure to Python (or a comparable language such as TypeScript, Go or C#) - able to understand how applications behave in production to support observability and reliability improvements
  • Previous experience working in small, highly autonomous teams
  • A working style that fits how we operate:
      • Comfortable with ambiguity and able to create structure in unclear situations
      • Proactive learning mindset (experiment, iterate, and adapt as the team evolves approaches)
      • Strong asynchronous written communication (Slack/Notion/docs) and a habit of keeping others in the loop
      • Autonomy and accountability - making progress independently and owning outcomes

Benefits

  • We want to ensure you have all the tools and environment you need to unleash your potential
  • If you have any specific accommodations or a unique preference, please contact us at [email protected] and we'll do what we can to customise your interview process for comfort and maximum magic!
  • Kraken is a certified Great Place to Work in France, Germany, Spain, Japan and Australia
  • In the UK we are one of the Best Workplaces on Glassdoor with a score of 4.7

Job title

Platform Engineer – Product Reliability, Mid/Senior Level

Job type

Experience level

Senior

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
