Hybrid Site Reliability Engineer

Posted last week

About the role

  • Site Reliability Engineer building and scaling cloud infrastructure at fintech startup Rainforest, owning systems from infrastructure design through production reliability in a fast-moving environment

Responsibilities

  • Owning and scaling Rainforest’s Amazon Web Services (AWS)-based cloud infrastructure using Terraform and infrastructure-as-code (IaC) orchestration
  • Building, operating, and continuously improving Elastic Kubernetes Service (EKS) and serverless environments that support our core payments services
  • Designing and maintaining modern CI/CD pipelines with GitLab to enable fast, safe deployments
  • Implementing and evolving monitoring, alerting, and observability to ensure high uptime and quick incident resolution using tools like OpenTelemetry, Prometheus, and New Relic
  • Automating infrastructure and operational processes to eliminate manual work and accelerate delivery
  • Working side-by-side with application engineers to improve system performance, reliability, and scalability
  • Leading incident response efforts, conducting postmortems, and driving continuous improvement
  • Helping to define and roll out SRE best practices, including SLIs, SLOs, and error budgets, as the company scales
  • Optimizing for cost, security, and compliance in a regulated fintech environment
  • Supporting and scaling PostgreSQL database infrastructure on Amazon RDS offerings (Aurora Global Database)

Requirements

  • 3+ years of experience in SRE, DevOps, or cloud infrastructure roles (startup or high-growth experience a plus)
  • Passion for building reliable systems that scale with the business
  • Strong hands-on experience with cloud infrastructure (AWS, Google Cloud, Azure)
  • Deep experience with IaC using tools such as Terraform, OpenTofu, Terragrunt, and CloudFormation
  • Solid production experience with container orchestration (Kubernetes, ECS)
  • Experience building CI/CD pipelines using tools like GitLab and GitHub Actions
  • Strong understanding of monitoring and observability principles, including designing and providing dashboards, visualizations, and alerts
  • Proficiency in at least one modern programming language (e.g., Python, Java, Go, or Ruby)
  • Bachelor’s degree in Information Science, Computer Science, or a related discipline, or equivalent work experience, preferred

Benefits

  • Comprehensive health benefits package
  • Unlimited paid time off
  • Paid parental leave
  • Fun and flexible working environment
  • Continuous investment in our people and our culture

Job title

Site Reliability Engineer

Job type

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
