Site Reliability Engineer responsible for building and maintaining cloud infrastructure at Tricentis. Collaborating with product engineers and enhancing operational processes for seamless scaling with innovative solutions.
Responsibilities
Design, build, and maintain the product cloud infrastructure that enables seamless scaling
Develop advanced monitoring systems that proactively alert on symptoms
Leverage tools like Terraform, GitHub Actions, and Kubernetes to efficiently manage AWS or Azure infrastructure
Continuously enhance operational processes, making deployments, upgrades, and other tasks as boring and automated as possible
Collaborate with product engineers on a daily basis and influence product architecture designs
Be part of an on-call rotation to respond swiftly to incidents affecting availability
Act as a reliability champion for stable counterpart assignments
Propose innovative ideas and solutions within the SRE organization and engineering
Proactively identify opportunities to enhance system availability and performance
Share learnings with the wider community
Be the first responder during emergencies and on-call duties
Requirements
Proficiency in Terraform syntax and GitHub Actions configuration
Working knowledge of SaaS architecture concepts and designs
Understanding of Kubernetes, including CLI usage and service re-provisioning
Ability to provision and set up metrics along with managing alerts and silences
Ability to identify Service Level Indicators (SLIs) that align the team with availability and latency objectives
Experience with Linux operating system configuration, package management, and troubleshooting
Working experience with cloud environments like Azure or AWS, including provisioning infrastructure in them
Good cultural fit: clear communication, empathy, curiosity and continuous learning, and a supportive, no-blame attitude
DevOps Engineer working closely with engineering and security teams to optimize CI/CD pipelines and manage infrastructure. Ensuring security and compliance for mission-critical financial applications.
Build and scale cloud infrastructure that powers Heidi's healthcare AI platform. Work with AWS and Azure while enhancing automation and reliability in an innovative healthtech startup.
Infrastructure-as-Code DevOps Engineer designing and managing cloud-native platforms at Vodafone. Collaborating with agile teams for digital transformation and business success.
Director of Data Engineering leading a strategic DevOps team within Enterprise AI. Balancing leadership with hands-on expertise to enable AI technology adoption.
Join a Data Engineering Team as a Senior DevOps to support multiple Data & AI initiatives. Utilize cloud technologies and enhance data pipelines in a collaborative environment.
Principal Site Reliability Engineer at Early Warning designing performance and resiliency patterns for applications and infrastructure. Collaborating with development teams to improve systems and data integrity.
DevOps Engineer contributing to CI/CD setup and Azure services management. Collaborates with teams to ensure efficient project delivery in a hybrid environment.
IT DevOps Specialist at BMW responsible for analyzing requirements and implementing software solutions in AWS cloud environments. Collaborating internationally within agile teams for digital transformation projects.
DevOps Engineer at Vistra designing, implementing, and maintaining robust CI/CD pipelines and cloud infrastructure. Enabling software delivery across multiple technology stacks with a focus on AWS.
Manage complex customer rollouts and initial system deployments at Talex.ai. Bridging technical development with real-world application in robotics and AI systems.