Senior Azure Site Reliability Engineer ensuring the reliability and performance of the Vew SaaS platform on Microsoft Azure. Collaborating with teams to design and implement resilient systems.
Responsibilities
Implement and maintain highly available, scalable, and fault-tolerant systems on Azure
Monitor system health and performance metrics to ensure reliability and proactively address issues
Develop and maintain automation scripts and tools for provisioning, deployment, monitoring, and scaling of services
Configure and maintain monitoring solutions to provide real-time visibility into system health and performance
Respond to and resolve incidents, including root cause analysis, mitigation, and communication with stakeholders
Ensure systems and infrastructure adhere to security best practices and compliance requirements
Identify areas for optimization and implement solutions to improve system reliability, performance, and efficiency
Requirements
Bachelor's degree in Computer Science, Engineering, or related field
Proven experience as a Site Reliability Engineer or similar role, preferably in a SaaS environment
Strong proficiency in Microsoft Azure services, including compute, networking, storage, and monitoring
Experience with automation tools and scripting languages such as PowerShell
Solid understanding of containerization technologies (e.g., Docker, Kubernetes) and orchestration tools
Experience with Bicep/Terraform and ARM templates for Infrastructure as Code (IaC)
Hands-on experience with monitoring and logging tools such as Azure Monitor, Grafana, Prometheus, or Datadog
Knowledge of security best practices, compliance standards (e.g., ISO 27001, SOC 2, GDPR), and relevant regulations
Excellent problem-solving skills and the ability to troubleshoot complex technical issues
Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment
Azure certifications such as Azure Administrator Associate or Azure Solutions Architect Expert are a nice-to-have
Junior and DevOps Engineers designing and running secure cloud-native platforms for UK public-sector organisations. Collaborating with teams to streamline deployment and automate infrastructure workflows.
DevOps Engineer at Gemba designing secure, cloud-native platforms for public-sector organizations. Leading technical decisions and collaborating to solve complex challenges for critical systems.
DevOps Engineer designing and constructing secure cloud-native platforms for public-sector organizations across the UK. Leading technical decisions while collaborating closely with clients.
DevOps Engineer automating cloud-native infrastructure for public-sector organizations. Join an agile team to enhance deployment processes and support critical systems.
Site Reliability Engineer optimizing global trading infrastructure for a crypto capital markets partner. Responsibilities include cloud environment management and system design for high availability.
DevOps Engineer responsible for implementing and operating CI/CD pipelines for SaaS services. Collaborating with teams to ensure reliable and secure operations in the Risk & Fraud business unit.
Site Reliability Engineer focused on building resilient systems and ensuring uptime at MealSuite. Involved in troubleshooting, platform reliability, and enhancing deployment automation.
(Senior) DevOps Engineer at Wavestone developing and operating complex software solutions for digitalization projects. Collaborating in teams and contributing to technology landscape advancements.
Reliability Engineer focused on the dependability and mission success of complex space systems. Involvement includes analyses, collaboration, and adherence to aerospace reliability standards.