Hybrid AI Infrastructure Engineer

Posted 2 days ago

About the role

  • As an AI Infrastructure Engineer at Xsolla, you will design AI/ML solutions for multi-cloud infrastructure and collaborate on automation workflows and observability systems that improve infrastructure management.

Responsibilities

  • Design and implement AI/ML-powered solutions for infrastructure use cases, including predictive autoscaling, anomaly detection, intelligent cost optimization, and automated remediation across GCP and multi-cloud environments
  • Build and maintain AI-driven monitoring and observability systems that correlate logs, metrics, and traces to surface root causes, predict bottlenecks, and reduce mean time to resolution (MTTR)
  • Develop and operate automated incident response workflows using AI-powered playbooks that diagnose, contain, and resolve infrastructure issues with minimal manual intervention
  • Integrate AI tooling into CI/CD pipelines to improve deployment reliability, automate test prediction, score release health, and support rollback automation
  • Contribute to the development of internal AI agents and virtual assistants integrated into developer workflows (Slack, IDEs, Confluence) — enabling self-service for provisioning, troubleshooting, and infrastructure guidance
  • Implement AI/ML-based anomaly detection and automated vulnerability management workflows to enhance the security posture of Xsolla's infrastructure
  • Prototype and productionize Generative AI solutions for infrastructure automation, including auto-generation of Terraform/Puppet modules, IaC configurations, runbooks, and change documentation
  • Collaborate with senior engineers and leadership to evolve and execute the infrastructure AI strategy across its implementation phases
  • Maintain clear documentation of AI tools, integrations, and automated workflows; share knowledge and best practices across the team

Requirements

  • 5–7 years of experience in infrastructure engineering, DevOps, SRE, or a related field
  • Hands-on experience with GCP (priority) and/or AWS; solid understanding of cloud resource management, scaling, and cost structures
  • Practical experience building or integrating AI/ML-powered tools in an operational context (anomaly detection, predictive models, LLM-based automation, or similar)
  • Experience with infrastructure-as-code tools — Terraform, Puppet, Ansible, or equivalent
  • Proficiency in Python for scripting, automation, and AI/ML integration; Bash or Go a plus
  • Working knowledge of Kubernetes and container orchestration in production environments
  • Familiarity with observability and monitoring stacks (Prometheus, Grafana, ELK, Datadog, or similar)
  • Familiarity with LLM APIs (OpenAI, Anthropic, or similar) and prompt engineering for operational use cases
  • Strong problem-solving mindset with a bias toward automation and eliminating toil
  • Fluent in English (written and verbal)

Job title

AI Infrastructure Engineer

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree
