About the role

  • Lead and mentor a team of two Site Reliability Engineers, providing technical guidance and career development
  • Work closely with the CTO to define and implement the technical infrastructure roadmap
  • Establish monitoring strategies and implement solutions to enhance reliability, scalability, and cost-efficiency
  • Collaborate with development team leads to optimise build, test, and deployment processes
  • Lead incident response and establish processes for troubleshooting production issues
  • Organise and oversee on-call rotations to ensure 24/7 system reliability
  • Drive documentation standards and knowledge sharing within the engineering organisation

Requirements

  • 5+ years of experience in DevOps or Site Reliability Engineering roles, with 2+ years in a managerial position
  • Proven experience managing and mentoring technical team members
  • Proficiency in at least one backend programming language (we use Python)
  • Strong knowledge of AWS services (ECS, S3, RDS, Lambda, etc.), managed with Terraform
  • Knowledge of observability frameworks and tools (we use OpenTelemetry, CloudWatch, and Datadog)
  • Excellent leadership, communication, and problem-solving skills
  • Experience with AI/ML infrastructure deployment and scaling

Benefits

  • Generous equity scheme - everyone gets to be an owner of Robin AI!
  • 20 days PTO, in addition to the public holidays observed in South Africa.
  • Growth opportunities: We prioritise promotions for high performers and help you to progress your career.

Job title

Lead SRE

Job type

Experience level

Senior

Salary

Not specified

Degree requirement

No Education Requirement

Location requirements
