Site Reliability Engineer at McKesson focusing on reliability, scalability, and performance of healthcare technology systems. Engaging in automation and monitoring to deliver exceptional user experiences.
Responsibilities
Design, implement, and maintain robust and scalable infrastructure and applications
Develop and implement automation scripts, tools, and processes to streamline operational tasks
Establish and maintain comprehensive monitoring, alerting, and logging systems
Participate in on-call rotations; respond to and resolve critical incidents
Collaborate with development teams to analyze system capacity and optimize resource utilization
Work closely with software engineers, product managers, and other SREs to promote a culture of reliability
Create and maintain clear and concise documentation for systems, processes, and incident runbooks
Contribute to the implementation and enforcement of security best practices
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
2+ years of experience in a Site Reliability Engineering, DevOps, or closely related software engineering role
Strong proficiency in at least one scripting language (e.g., Python, Go, Ruby, Bash)
Hands-on experience with cloud computing platforms (e.g., AWS, Azure, GCP)
Experience with container technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes)
Familiarity with Continuous Integration and Continuous Delivery (CI/CD) pipelines and tools
Experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, Splunk)
Strong understanding of Linux/Unix operating systems
Fundamental understanding of networking concepts (TCP/IP, DNS, HTTP, Load Balancing)
Excellent analytical and problem-solving skills with a proactive approach
Mechanical/Reliability Engineer responsible for mechanical installations in Bergen op Zoom. Analyzing maintenance strategies and leading projects to enhance reliability.
Senior DevOps Engineer responsible for cloud infrastructure and deployments. Optimizing AWS services and ensuring system security and reliability for Verizon.
Senior DevOps Engineer responsible for automating infrastructure and building CI/CD pipelines for a collaborative robotics company. Collaborating with global engineering teams from the Bangalore office.
Site Reliability Engineer Intern at Tencent working on gaming services and cloud-native solutions. Collaborating with global teams to eliminate toil and enhance reliability.
Cloud/DevOps Specialist at N5X managing and optimizing critical cloud infrastructures for Brazilian energy trading. Collaborating with a multidisciplinary team to ensure high availability and performance.
Cloud/DevOps Specialist responsible for designing a hybrid architecture combining cloud and on-premises infrastructure for energy trading systems. Collaborating with a multidisciplinary team in a dynamic environment.
Reliability Engineering Specialist utilizing reliability tools and models to improve asset performance at Enbridge. Collaborating across teams to guide investment decisions for safe operations.
DevOps Engineer responsible for structuring and supporting cloud DevOps architecture in Brazil. Working strategically on automation and CI/CD practices with development teams in Pernambuco.
DevSecOps Software Engineer developing secure CI/CD pipelines for Boeing's military software systems. Collaborating with cross-functional teams and implementing automation and security best practices.
DevOps Manager responsible for managing a team delivering multi-cloud solutions in support of the USAF Cloud One project. Focusing on scalable cloud-native solutions and CI/CD practices.