DevOps & Kubernetes Engineer at an AI software startup near Porto, managing Kubernetes clusters for ML workloads and collaborating on infrastructure solutions.
Responsibilities
Design, deploy, and manage production-grade Kubernetes clusters for ML and microservice workloads
Maintain and optimize container orchestration, including service mesh, network policies, and resource allocation
Oversee CI/CD pipelines using tools like GitHub Actions, GitLab CI, and Terraform
Manage Docker image lifecycle and enforce security best practices
Monitor infrastructure health using Prometheus, Grafana, and centralized logging solutions
Collaborate on Infrastructure as Code (IaC) and ensure scalable, reproducible deployments
Support GPU-based workloads and optimize GPU resource utilization for LLM agents
Maintain Linux-based cloud servers, implement security protocols, and manage DNS, VPNs, and firewalls
Troubleshoot Python microservices and contribute to automation and monitoring setups
Implement model serving and orchestration pipelines (MLflow, Kubeflow, etc.)
Ensure high availability and disaster recovery strategies across systems
Requirements
3-7+ years of advanced hands-on experience with Kubernetes administration, including networking, storage, and security
Proficient in Docker, multi-stage builds, and image lifecycle management
Strong Linux system administration skills (Ubuntu or RHEL-based systems)
Experience with cloud platforms, ideally Google Cloud Platform (GCP)
Solid understanding of CI/CD pipelines using tools like GitHub Actions, GitLab CI, or Jenkins
Familiarity with Infrastructure as Code (Terraform, Ansible) and GitOps workflows
Confident in managing GPU workloads and ML/LLM-serving infrastructure
Experience with monitoring and observability tools such as Prometheus, Grafana, ELK/EFK stack
Comfortable with Python microservices and ML workflow troubleshooting
Fluent English skills at C1 or above
Benefits
Competitive Salary: Commensurate with your experience and contributions.
Flexible Work Setup: On-site collaboration in Porto, with the option for full remote work based on strong performance after onboarding.
Relocation Support: For your on-site onboarding, or if you decide to move to Porto, you receive support with logistics, housing, and onboarding connections to make the move smooth.
Training & Growth Budget: Set aside for conferences, courses, and certifications.
Daily Meal Subsidy: Enjoy lunch on the company when working from the office.
Team Events: From BBQs to game nights and a Christmas party, with the first drinks on the house.
Onboarding Buddy: You won’t be left alone. You get paired with someone who helps you ramp up quickly.
DevOps Engineer supporting cloud modernization for the Department of the Air Force on the Cloud One contract. Involved in systems analysis, security practices, and collaboration with engineering teams.
Journeyman Cloud Operations Engineer maintaining cloud infrastructure across DoD organizations. Supporting DevSecOps and ensuring compliance with security requirements in a high-visibility program.
DevOps Engineer managing cloud-native platforms for Capgemini. Collaborating with development, data/ML, and security teams to deliver scalable solutions on Azure.
Head of IT & DevSecOps at JamLoop, managing internal technology and security improvements. Leading strategy and implementation of cloud infrastructure for efficiency and reliability.
I&E Maintenance and Reliability Engineer at LyondellBasell focused on asset maintenance strategies in a multidisciplinary environment. Collaborating for operational excellence and safety performance at the Pasadena facility.
Manager, DevOps & Cloud Infrastructure overseeing security and operational efficiency in a hybrid environment at Thomson Reuters. Leading teams to deliver secure solutions in on-premises and cloud setups.
DevOps Engineer responsible for building and maintaining the infrastructure of IONOS' AI platform. Collaborating on CI/CD pipelines and ensuring system optimization across various locations.
DevOps Engineer building and supporting cloud infrastructure at PointClickCare. Collaborating with senior engineers and software teams to enhance AI-enabled workloads and improve system reliability.
DevOps specialist working with Kubernetes and Terraform, ensuring project stability and efficiency for Convercus. Join a small, dynamic team in a hybrid work environment.
Cloud & DevOps Engineer at XTEL managing Azure infrastructure and deploying applications. Collaborating within an international team to drive technological excellence.