Platform Engineer responsible for developing and managing Kubernetes environments for AI solutions in healthcare. Collaborating with teams to enhance core infrastructure and streamline deployments.
Responsibilities
Design, deployment, and management of scalable and secure Kubernetes clusters on OVHcloud.
Ownership and advancement of our CI/CD pipelines for automated, reliable application and infrastructure deployments.
Implementation and management of our GitOps workflows using tools like ArgoCD or Flux.
Management and scaling of GPU workloads in Kubernetes, ensuring optimal performance and resource utilization for our ML teams.
Development and maintenance of our observability stack (VictoriaMetrics, VictoriaLogs, Grafana, tracing) to ensure deep visibility into system health.
Management of our cloud infrastructure on OVHcloud, focusing on automation (Infrastructure as Code), cost optimization, and security.
Lifecycle management of core platform services, including message brokers (RabbitMQ), databases (PostgreSQL, Redis), and authentication systems (Okta, OIDC, OAuth2).
Acting as a key responder for infrastructure incidents; debugging and troubleshooting complex production issues across distributed systems.
Supporting and empowering development teams by providing robust self-service tools, clear documentation, and collaborative support.
Requirements
3-5+ years of professional experience in a Platform Engineering, DevOps, or SRE role
Deep, hands-on experience with Kubernetes in a production environment (cluster management, networking, security, scheduling)
Proven experience managing infrastructure on a cloud provider (OVHcloud is a strong plus; AWS, GCP, or Azure experience is also valued)
Strong practical knowledge of CI/CD systems (e.g. GitHub Actions) and GitOps principles (ArgoCD, Flux)
Proficiency with Infrastructure as Code (IaC) tools like Terraform or Pulumi
Solid understanding of observability principles and tools (e.g. VictoriaMetrics, VictoriaLogs, OpenTelemetry/Tracing, Grafana)
Experience managing stateful services in production (e.g. PostgreSQL, Redis, RabbitMQ)
Solid scripting skills in Python
Benefits
Full ownership of a mission-critical platform
A team that values curiosity, learning, and experimentation
Remote-first setup with the option to work in our Berlin office
Power Platform Developer designing and maintaining Power Platform solutions for a global technology leader. Collaborating with teams to deliver technology solutions that enhance productivity and sustainability.
Supporting the architecture, design, and implementation of Kubernetes environments. Involved in CI/CD pipelines, multi-cloud orchestration, and providing relevant content for clients.
Platform Engineering Manager leading engineering of Anglian Water’s hybrid digital platforms. Focusing on secure and scalable cloud and on-premise infrastructure while enabling digital service delivery.
Platform Engineer responsible for maintaining uptime and stability of robot testing platforms. Collaborating with teams for high reliability in testing environments for autonomous vehicles.
Platform Engineer for Pfizer’s Data and AI Platforms team, developing Azure solutions and pro-code AI agent applications. Leading engineering and operations for a scalable enterprise-grade platform.
Senior Embedded Platform Engineer developing low-level embedded software for Ford's Electric Vehicles and the future of transportation. Collaborating with agile teams to ensure functionality and efficiency.
Senior IT Engineer enhancing cloud platform and infrastructure reliability at Xcel Energy. Collaborating with teams to influence platform strategy and deliver high-impact capabilities.
Platform Engineer developing Kubernetes solutions supporting multi-tenant platforms at Bundesdruckerei in Berlin. Collaborating on innovative digital solutions for identity and data protection.
Lead Platform Engineer at PGIM Private Capital focusing on cloud modernization. Collaborating with cross-functional teams to develop cloud-based applications in a hybrid work environment.
AI Platform Engineer designing and deploying AI and ML platforms at Utica National Insurance Group. Collaborating with internal teams to implement AI solutions and establish observability and telemetry.