Configure, manage and troubleshoot virtualization and container environments, including hypervisors, virtual machines, virtualized switches (SDN), physical datastores (LUN), virtualized datastores (vSAN, Ceph and FusionStorage), pods and containers, servers and storage systems
Execute acceptance tests and functional validation of new elements and systems
Work with responsible teams to validate and ensure integrations with monitoring and inventory tools
Document operational procedures and propose improvements to technical processes
Monitor performance indicators and adjust configurations to optimize resources and prevent bottlenecks
Troubleshoot Linux and Windows Server operating systems (bare metal and virtualized)
Provide technical support for NFVi infrastructure evolution projects, such as upgrades, migrations and integrations
Collaborate with technical teams to resolve issues and continuously improve the environments
Requirements
Bachelor's degree, preferably in Telecommunications Engineering, Information Technology or Computer Systems
Experience operating and troubleshooting medium- to large-scale virtualized or NFVi environments
Experience with rack and blade servers from major vendors (Huawei, Dell, HPE, Nokia) and storage systems (Huawei and HPE)
Knowledge of hypervisors and virtualization platforms such as VMware, Red Hat OpenStack, Red Hat CoreOS (OpenShift), Huawei FusionSphere and FusionStage
Knowledge of Linux operating systems (RHEL, CentOS, Debian) and Windows Server
Familiarity with datacenter monitoring and management tools (NMS, element managers), such as OpenManage, NADCM, OneView and eSight
Advantageous: industry certifications such as VMware VCP (Data Center/NSX), Red Hat RHCSA, Microsoft AZ-800 and AZ-801, ITIL Intermediate, COBIT, Scrum and Lean Six Sigma Yellow Belt
Advantageous: experience troubleshooting networking in mainstream operating systems (bare metal and VMware virtual environments)
Advantageous: experience with automation and scripting, such as Shell, Ansible, PowerCLI, Perl and Python
Advantageous: knowledge of AI applications for predictive failure analysis and operations automation
Advantageous: experience with Infrastructure as Code (IaC) using tools such as Terraform and Ansible
Advantageous: experience with modern observability and telemetry tools such as Prometheus, Grafana and the ELK Stack
Advantageous: knowledge of security best practices for virtualized environments, including hypervisor hardening and access management
Advantageous: familiarity with data tools such as Power BI and Power Automate for performance analysis and automation
Advantageous: basic knowledge of container and orchestration technologies such as Docker and Kubernetes
Intermediate English (reading, speaking and writing)
Benefits
Flexible Benefits Program
Medical and Dental Assistance *
Medication Benefit *
Wellhub (formerly Gympass) *
Meal and/or Food Vouchers
Financial Wellness Program
Private Pension Plan
Mobile phone with unlimited data and voice allowance
Partnerships and discounts with over 3,000 companies and institutions, including discounts on your electricity bill and broadband internet
Online English course extendable to one family member or friend
Internal Training and Development Program
Profit Sharing
"My First Benefit" - assistance for children up to 2 years old
Daycare reimbursement (for fathers or mothers)
Flexible work models and schedules
Happy Day - day off during your birthday month
Extended leaves for maternity, paternity, marriage and adoption