Senior DevOps Engineer ensuring the reliability and performance of Kami's applications. Join a dynamic team in Auckland, New Zealand, and support a growing education platform.
Responsibilities
Analyze and optimize system reliability, performance, and resource utilization of cloud infrastructure.
Develop and maintain automation scripts for deployment, monitoring, and maintenance tasks.
Implement infrastructure as code (IaC) to automate the provisioning and configuration of infrastructure components.
Design and implement monitoring solutions to proactively identify and address issues.
Participate in on-call rotations and respond to incidents to ensure system stability and performance.
Conduct capacity planning to anticipate future resource needs and optimize infrastructure scalability.
Define and track reliability metrics to measure and improve system performance.
Prepare and present reports on system reliability and performance.
Work closely with software development teams to influence and improve the reliability and scalability of applications.
Conduct post-incident reviews to identify root causes and implement preventive measures.
Troubleshoot complex issues in a production environment.
Requirements
7+ years of experience in a DevOps, SRE, or similar role.
Bachelor's degree in Computer Science, Information Technology, or a related field.
Relevant experience in software engineering, systems administration, or a related field.
Proficiency in programming languages (e.g. Python, Go, Ruby).
Strong scripting skills for automation tasks (e.g. Bash, Python).
Hands-on experience and in-depth knowledge of cloud platforms (e.g. Google Cloud, AWS) and container orchestration tools (e.g. Kubernetes).
Solid understanding of core networking concepts (e.g. TCP/IP, DNS, load balancing).
Familiarity with Infrastructure as Code (IaC) tools (e.g. Terraform) and/or configuration management tools (e.g. Ansible, Puppet, Chef).
Experience with infrastructure monitoring, logging, and alerting tools (e.g. Datadog, Prometheus, Grafana, PagerDuty), including log analysis.
Strong collaboration and communication skills to work effectively with cross-functional teams.
Ability to analyze complex systems and troubleshoot issues effectively.
Benefits
A people-first employer on an inspiring mission to build the future of education while changing the lives of millions.
Continuous learning and development opportunities, including subsidised course fees, certifications, conferences, free access to Udemy, and more.