Participate in the design phase of latency-driven, high-scale systems. Build out scalable and reliable infrastructure solutions for new projects. Write scripts to monitor systems and automate routine tasks. Maintain our infrastructure-as-code system. Design and develop tooling to assist development teams. Experiment with new tools and processes to improve team routines and communication. Troubleshoot issues across the entire stack (software, application, and network). Document current and future configuration processes and policies. Take part in a shared 24x7 on-call rotation to ensure high availability of our systems.
Requirements
Ability to prioritize tasks and work autonomously. Good knowledge of configuration management and IaC (Ansible or Terraform). Practical experience with the Prometheus/Grafana monitoring stack. Practical knowledge of shell scripting and programming languages such as Python. Experience with database technologies (MySQL, Redis). Experience with cloud computing platforms (AWS/GCP/Azure). Good knowledge of and practical experience with Kubernetes and Docker. Excellent verbal and written communication skills in English.
Benefits
Access to the latest technologies and equipment. A cool, pet-friendly working environment in the city center. Flexible work conditions with an option for a hybrid schedule. Training budget for self-development. A seniority program triggering additional vacation days. Compensation for health, sports, and well-being. A shuttle program offering you an opportunity to work from our other offices. Fun and memorable team events. Working alongside the coolest team members you’ll ever meet.
DevOps Analyst providing high-quality and reliable solutions within multifunctional teams at a technology-focused financial organization. Automating build and deployment solutions in a hybrid work environment.
Network & Datacenter Deployment Engineer at Cloudflare focused on building and expanding their global network infrastructure with collaboration across multiple engineering teams and vendors.
Senior DevOps Engineer leading cloud-native solutions at Sparksoft Corporation. Driving automation and system reliability within a fast-paced Agile team.
Platform Engineer focusing on supporting CI/CD pipelines and Kubernetes at PCCW. Responsible for ensuring platform services' reliability and performance, with night-time support as needed.
Site Reliability Engineer at Bumble optimizing large-scale Linux environments and ensuring system stability. Focusing on troubleshooting, incident recovery, and performance tuning in complex infrastructures.
DevOps Manager overseeing an engineering team developing scalable CI/CD processes for NVIDIA Networking products. Enhancing global R&D efficiency in a technology-focused company.
Senior DevOps Manager overseeing CI/CD processes for NVIDIA Networking products. Leading a team and collaborating with global teams to enhance R&D efficiency and infrastructure.
Join the Operations Team as a Senior Site Reliability Engineer driving operational excellence for cybersecurity solutions. Collaborate across teams to manage production platforms and optimize infrastructure.