Hybrid Senior AI Site Reliability Engineer

Posted last month

About the role

  • Reports to: Senior Director, Platform Engineering.
  • Develop and deploy AI/ML-driven solutions for monitoring, anomaly detection, and predictive alerting to improve system reliability and reduce MTTR.
  • Use AI techniques to optimize capacity planning, autoscaling, and resource utilization across distributed systems.
  • Automate repetitive operational tasks with intelligent agents and large-scale data analysis.
  • Integrate LLMs and generative AI into incident response, post-mortem analysis, and business continuity.
  • Partner with platform and product engineering teams to embed AI-based observability into services from the ground up.
  • Continuously evaluate new AI/ML methods and tools to expand SQUIRE’s AI-driven SRE capabilities.
  • Drive a culture of experimentation: build prototypes, run pilots, measure results, and productionize what works.
  • Mentor engineers on applying AI approaches to reliability problems; help establish standards and best practices.

Requirements

  • 5+ years of experience in Site Reliability Engineering, DevOps, or related roles.
  • Proven experience using AI/ML (supervised learning, anomaly detection, LLMs, etc.) to solve operational or reliability problems.
  • Strong background in distributed systems, cloud infrastructure (AWS preferred), and container orchestration (Docker, ECS, Elastic Beanstalk).
  • Proficiency with observability stacks (Datadog, Sentry, Prometheus, etc.).
  • Solid programming/scripting skills in Python, Go, or a similar language, with experience integrating ML/AI libraries and APIs.
  • Hands-on experience with automation frameworks and infrastructure as code (Terraform, CloudFormation, etc.).
  • Excellent analytical and problem-solving skills, with the ability to innovate in operational domains.
  • Strong communication and collaboration skills across technical and non-technical stakeholders.
  • English proficiency is a must: you will work with English-speaking coworkers and need to communicate your ideas clearly.
  • Must be based in Buenos Aires.
  • Availability to work on-site in our office in CABA two days a week (Tuesdays and Thursdays).

Job title

Senior AI Site Reliability Engineer

Job type

Experience level

Senior

Salary

Not specified

Degree requirement

No Education Requirement

Location requirements
