Hybrid Site Reliability Engineer

Posted 2 hours ago

About the role

  • As a Site Reliability Engineer at Plenful, you will maintain system performance and reliability, collaborating with teams to improve operations and ensure system stability in a fast-paced environment.

Responsibilities

  • Maintain and evolve alerting so engineers receive clear, actionable signals for anomalies, latency regressions and reliability risks
  • Define observability standards across metrics, logs and tracing with a focus on reliability, performance and customer impact rather than vanity metrics
  • Investigate performance bottlenecks across our distributed systems including serverless task execution, containerized services, workflow orchestration and Postgres
  • Lead incident response, coordinate root cause analysis and ensure reliability improvements are fully implemented and measured
  • Improve the reliability of our distributed task processing, including autoscaling behavior, execution patterns, retry logic, rate limiting and failure isolation
  • Support the stability of our serverless pipelines that process high-volume workloads across multiple execution layers
  • Partner with backend and ML teams on designing resilient mechanisms for scheduling, queueing and workflow execution
  • Maintain efficient and predictable resource usage across compute, networking and storage
  • Support security and compliance work including patching, audit readiness and vulnerability management
  • Participate in the on-call rotation and respond to production incidents quickly and calmly, with a focus on restoring stable service and communicating clearly
  • Contribute to blameless postmortems, drive follow-through on fixes and ensure learnings are documented for future engineers

Requirements

  • 5+ years of professional engineering experience at a B2B SaaS company
  • Strong experience operating production systems in cloud environments, ideally AWS
  • Hands-on experience with serverless compute patterns, containerized services, distributed workflows and Postgres
  • Solid understanding of observability tooling, performance debugging and system behavior under load
  • A high-ownership mindset, empathy for teammates, straightforward communication and a one-team attitude
  • Comfortable working in a fast-paced startup environment with a bias for action and thoughtful engineering judgment

Benefits

  • Unlimited PTO
  • Fully covered health insurance (medical, dental, and vision)
  • Meal stipend
  • Health & wellness stipend
  • 401(k) matching
  • Stock options

Job title

Site Reliability Engineer

Job type

Experience level

Mid level, Senior

Salary

Not specified

Degree requirement

No Education Requirement

Location requirements
