Hybrid Site Reliability Engineer – Operations

Posted 1 hour ago

About the role

  • As a Site Reliability Engineer, you will be responsible for system reliability and performance at a leading financial services technology company, collaborating with infrastructure, engineering, and security teams to build robust systems.

Responsibilities

  • Maintain and improve the uptime, performance, and availability of production systems.
  • Define and track SLIs, SLOs, and SLAs to ensure service reliability and user satisfaction.
  • Implement and manage monitoring, alerting, and observability tools (e.g., Prometheus, Grafana, Datadog, ELK).
  • Participate in on-call rotations and respond to incidents, performing root cause analysis and postmortems.
  • Automate repetitive tasks and processes using scripts, configuration management, and Infrastructure as Code (IaC).
  • Develop CI/CD pipelines to streamline deployment and operational processes.
  • Analyze system performance and capacity trends to plan for future growth.
  • Collaborate with engineering teams to design systems that scale reliably.
  • Support cloud and/or hybrid infrastructure (AWS, Azure, GCP, VMware, etc.).
  • Manage system provisioning, configuration, and patching via tools such as Ansible, Terraform, or Puppet.
  • Act as a bridge between development and operations teams, championing DevOps and SRE principles.
  • Contribute to a culture of continuous improvement, reliability, and accountability.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
  • 3+ years of experience in a Site Reliability, DevOps, or Systems Engineering role.
  • Experience with Linux/Unix systems, Windows, shell scripting, and administration.
  • Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP).
  • Strong knowledge of networking, security, load balancing, and DNS.
  • Experience with monitoring/logging tools (e.g., Prometheus, Grafana, ELK, Splunk, Datadog).

Benefits

  • Flexibility: Hybrid Work Model & a Business Casual Dress Code, including jeans
  • Your Future: 401k Matching Program, Professional Development Reimbursement
  • Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays
  • Your Wellbeing: Medical, Dental, Vision, Employee Assistance Program, Parental Leave
  • Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity
  • Training: Hands-On, Team-Customized, including SS&C University
  • Extra Perks: Discounts on fitness clubs, travel and more!

Job title

Site Reliability Engineer – Operations

Job type

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
