Senior Manager of Site Reliability Engineering overseeing the Workday Kubernetes-based platform. Leading teams while ensuring high availability and collaborating with federal agencies.
Responsibilities
Manage and lead the teams that keep the Workday Kubernetes-based platform maintained and healthy
Maintain core platform components for high availability, scalability, and security
Automate infrastructure provisioning and application deployments using tools like Terraform and Argo CD
Provide support and resolve platform-related issues in collaboration with development teams
Implement and maintain standard security methodologies to ensure compliance
Build and maintain comprehensive documentation for platform components and processes
Actively participate in knowledge sharing within the team
Coach and mentor team members for career growth
Requirements
5+ years of experience managing and leading site reliability engineering teams
5+ years of hands-on experience working with large-scale cloud infrastructure, automation, and DevOps methodologies
Bachelor's degree in a computer-related field or equivalent work experience
Proficiency in infrastructure automation tools like Terraform
Experience with building, maintaining, and consuming CI/CD pipelines and tools like Argo CD
Strong analytical and problem-solving skills
Deep understanding of Agile Methodology principles
Strong understanding of Continual Improvement Process principles
Benefits
Workday Bonus Plan eligibility
Role-specific commission/bonus
Annual refresh stock grants
Flexible working hours
Professional development opportunities
Job title
Senior Manager, Site Reliability Engineering, Operations
Site Reliability Engineer focusing on AWS cloud environments, SRE practices, and system reliability within GFT's team. Collaborating on cloud migrations and observability initiatives.
Senior DevOps Analyst enhancing infrastructure automation in a transformative technology firm. Collaborating on innovative projects in sectors like healthcare, finance, and utilities in Brazil.
Consultant at Minsait supporting technical decisions in infrastructure automation and developing solutions. Collaborating with teams for maintaining and evolving automation platforms.
Practical Trainee focusing on hardware reliability engineering at Sonova. Supporting reliability improvement initiatives and working closely with experienced engineers on real-life product challenges.
Configuration Management Engineering Technician supporting naval shipbuilding projects with engineering documentation and configuration integrity. Establishing and maintaining relationships with stakeholders in the shipbuilding community.
Principal Configuration Management Engineering Technician contributing to major shipbuilding programs for national security. Leading Configuration Management teams and ensuring data integrity for advanced naval vessels.
Senior Configuration Management Engineering Technician at Babcock supporting naval engineering programmes across multiple ship configurations. Influencing critical decisions and contributing to engineering outcomes for national defence.
DevOps Engineer designing and managing scalable Azure cloud infrastructure for a financial technology company. Collaborating with teams to enhance system reliability and automate application delivery pipelines.
DevOps Engineer responsible for designing and managing Azure cloud infrastructure for a financial services provider. Collaborating with development teams to optimize system reliability and security.
Senior DevOps Engineer responsible for scaling and securing the infrastructure behind a healthcare AI platform. Collaborating with teams to deliver integrations and drive automation best practices.