Senior Data & Site Reliability Engineer at Stefanini ensuring the reliability and operation of data platforms and analytical services.
Responsibilities
The Senior Data & Site Reliability Engineer is responsible for ensuring the reliability, stability, and continuous operation of the organization's data platforms and analytical services.
This role combines best practices from Site Reliability Engineering (SRE) and Data Reliability Engineering (DRE), focusing on incident prevention, process automation, reducing mean time to recovery (MTTR), and improving the end-to-end operational experience.
The engineer leads the definition and governance of service-level indicators and objectives (SLIs/SLOs) such as freshness, completeness, latency, reliability, and availability, driving the evolution toward IOps and NoOps operating models.
Requirements
At least 2 years of experience in SRE, DRE, DevOps, or data platform engineering roles in production environments.
Demonstrable experience leading critical incidents and automation projects in data environments.
2+ years of experience in SRE, DRE, DataOps, or Platform Engineering roles
Proficiency in Apache Airflow: DAG management, debugging, pipeline optimization
Experience with dbt (data build tool): models, tests, data lineage
Knowledge of Amazon Redshift: administration, query optimization, WLM
Hands-on experience with Grafana + Prometheus: dashboards, alerts, PromQL
Experience with OpsGenie or an equivalent alert management tool
Knowledge of AWS Glue, Lambda, CloudWatch
Familiarity with SRE methodologies: error budgets, SLOs, SLIs, SLAs
Experience with Jira Service Management or an equivalent ITSM tool
DevOps Engineer designing CI/CD pipelines and managing Azure cloud infrastructure for leading organizations. Collaborating with global teams and automating deployment processes across projects.
Senior DevOps professional at iugu managing system reliability and performance in a dynamic environment. Collaborating with development teams and automating processes for efficiency.
Site Reliability Engineer maintaining the ShiftKey Marketplace platform while ensuring its stability and availability. Collaborating on infrastructure projects and support with a remote-first approach.
Site Reliability Engineer ensuring platform stability and managing AWS migration. Focused on hands-on maintenance work and engineering automation for a healthcare staffing platform.
Site Reliability Engineer maintaining stability and availability of healthcare staffing platform while collaborating with engineering teams on AWS migration projects.
Site Reliability Engineer for ShiftKey, ensuring stability and performance of healthcare management platform. Involves maintenance and development initiatives with a proactive approach to prevent incidents.
DevOps Team Lead managing deployment and operations of FedRAMP authorized products at Semperis. Leading a team in a regulated environment with a focus on security and process improvement.
Senior DevOps Engineer responsible for deployment and secure operations of FedRAMP products at Semperis. Focusing on compliance, automation, and collaborating with security teams.
DevOps/IT Apprentice supporting cloud infrastructure and CI/CD pipelines at tech startup. Involves learning, taking ownership, and growing within the engineering team.