Site Reliability Engineer responsible for building and maintaining cloud infrastructure at Tricentis. Collaborating with product engineers and enhancing operational processes for seamless scaling with innovative solutions.
Responsibilities
Design, build, and maintain the product cloud infrastructure that enables seamless scaling
Develop advanced monitoring systems that proactively alert on symptoms
Leverage tools like Terraform, GitHub Actions, and Kubernetes to efficiently manage AWS or Azure infrastructure
Continuously enhance operational processes, making deployments, upgrades, and other tasks as boring and automated as possible
Collaborate with product engineers on a daily basis and influence product architecture designs
Be part of an on-call rotation to respond swiftly to incidents affecting availability
Act as a reliability champion for stable counterpart assignments
Propose innovative ideas and solutions within the SRE organization and the wider engineering organization
Proactively identify opportunities to enhance system availability and performance
Share learnings with the wider community
Be the first responder during emergencies and on-call duties
Requirements
Proficiency in Terraform syntax and GitHub Actions configuration
Working knowledge of SaaS architecture concepts and designs
Understanding of Kubernetes, including CLI usage and service re-provisioning
Ability to provision and set up metrics along with managing alerts and silences
Ability to identify Service Level Indicators (SLIs) that align the team with availability and latency objectives
Experience with Linux operating system configuration, package management, and troubleshooting
Working experience with cloud environments like Azure or AWS, including provisioning infrastructure in them
Good cultural fit: clear communication, empathy, curiosity and continuous learning, and a supportive, no-blame attitude
Senior Site Reliability Engineer managing the reliability and operational health of the Loan Origination System for a fintech company. Collaborating with engineering teams in Brazil and the US to improve system reliability.
Cloud Engineer working with Azure DevOps and digital transformation in a global team at EY. Collaborating on cloud engineering projects and supporting CI/CD pipeline development.
DevOps Engineer creating better conditions for developers in Saab's defence technology. Collaborating with developer teams for effective continuous development and delivery of software.
DevOps Infrastructure Engineer at Bull, strengthening the AdminLab Echirolles team. Working on Linux infrastructure and automation practices in an HPC environment.
Product Quality & Reliability Engineer developing quality and reliability standards for Applied Materials. Designing methods for testing products and analyzing operational data in a supportive team environment.
DevOps System Engineer creating and managing infrastructure for ESET's global SaaS service. Collaborating with tech teams to maintain secure and stable operations.
Provides expertise in business applications design and functionality. Supports users and validates technical designs for alignment with business needs.
Senior Site Reliability Engineer supporting the reliability and performance of Broadridge’s fintech platform. Collaborating with senior engineers on automation, infrastructure, and production stability.
DevOps Engineer at Mindera focusing on Windows environments and Azure cloud solutions. Involves system modernization, automation, and migration projects with collaborative teams.
DevSecOps Lead supporting Synthesized's cloud automation strategy with a focus on security and compliance. Collaborating closely with development teams to shape cloud architecture and enhance deployment processes.