Manage and expand cloud infrastructure (Azure, Kubernetes), putting in place IaC best practices
Implement observability instrumentation, dashboards and alerts to monitor product health
Collaborate with engineers on product teams on new infrastructure needs or developer-experience improvements, and make application-level changes that improve system reliability (e.g., database configuration, JS bundle delivery, caching, or network resiliency)
Implement security requirements (HITRUST and SOC2) and collaborate with the Technical Program Manager on audits
Maintain databases (PostgreSQL, Redis) with backups, migrations, and security/privacy controls while monitoring performance and stability
Requirements
7+ years of experience in an SRE, Production Engineering, Infrastructure Engineering, or related role
Strong proficiency with SQL, Git, Kubernetes, Bash, and Networking (DNS, SSL, IP)
Familiarity with Azure, JavaScript/TypeScript, Python, GitHub, and VS Code
Security-oriented mindset and experience implementing security best practices
Benefits
Competitive salary and equity in a high-growth company
Head of CloudOps SRE leading operational excellence and reliability at S&P Global’s multi-cloud infrastructure. Directing teams to enhance security, performance, and cost efficiencies.
Sr. DevOps Engineer at Supplier.io automating software and data delivery lifecycle. Managing CI/CD pipelines and collaborating with development, data, and operations teams in a hybrid environment.
DevOps Product Manager working on complex platform and infrastructure projects. Consulting on DevOps best practices and ensuring scalable, efficient digital ecosystems for clients.
Site Reliability Engineer optimizing large-scale Linux environments at Bumble Inc. Troubleshooting incidents and driving performance improvements on platforms such as Kafka and Kubernetes.
Senior DevOps Engineer at mylo, managing multi-cloud infrastructure and CI/CD pipelines. Promoting DevOps culture while ensuring compliance and automating system maintenance.
Lead Site Reliability Engineer at S&P Global's Cloud Engineering team. Responsible for designing and maintaining cloud infrastructure and ensuring the performance of cloud-based systems.
Site Reliability Engineer responsible for monitoring and improving the reliability of satellite operations infrastructure. Collaborating with teams to automate processes in a dynamic environment.
DevOps Analyst providing high-quality, reliable solutions within multifunctional teams at a technology-focused financial organization. Automating build and deployment solutions in a hybrid work environment.