Develop, scale, and operate OpenShift managed cloud services
Contribute code to increase the scalability and reliability of the service
Help develop peers’ capabilities through knowledge sharing, mentoring, and collaboration
Participate in a regular on-call schedule, including occasional paid weekends and holidays
Practice sustainable incident response and blameless postmortems
Resolve customer issues escalated from the Red Hat Global Support team
Work within a small agile team to develop and improve SRE software, support peers, plan, and self-improve
Proactively utilize AI-assisted development tools for code generation, auto-completion, and intelligent suggestions
Participate in AI-assisted code reviews, utilizing tools that provide real-time feedback
Collaborate with cross-functional teams to identify opportunities for AI integration within the software development lifecycle
Requirements
Bachelor’s degree in Computer Science, Engineering, or related field; equivalent practical experience will also be considered.
5+ years of experience in at least one programming language (Python, Golang, Java)
Hands-on experience with public cloud platforms (AWS, GCP, Azure); Azure preferred
4+ years of experience with Kubernetes or OpenShift
Experience with Docker-based containers
Strong collaboration and problem-solving skills in distributed, team-based environments.
Experience troubleshooting as-a-service offerings (SaaS/PaaS) and working with complex distributed systems.
Working knowledge of Linux/Unix operating systems.
Proven ability to automate repetitive tasks and debug performance issues.
5+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus
3+ years of experience with enterprise configuration management software like Ansible by Red Hat, Puppet, or Chef
2+ years of experience delivering a hosted service
Demonstrated ability to quickly and accurately troubleshoot system issues
Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
Solid communication skills and experience working directly with and presenting to customers
Benefits
Health insurance
Professional development opportunities
Flexible working arrangements
Job title
Senior Site Reliability Engineer – OpenShift/Kubernetes, Golang, Linux