Cloud DevOps Engineer focusing on automated infrastructure across cloud and on-prem environments. Managing Kubernetes and CI/CD pipelines, and collaborating with development and operations teams.
Responsibilities
Design, implement, and maintain CI/CD pipelines using Jenkins, GitLab CI, Argo CD, etc.
Automate infrastructure provisioning and configuration using Terraform, Ansible, and Helm.
Deploy, configure, and manage Kubernetes clusters in cloud and on-premises environments (EKS, AKS, GKE, Rancher, RKE2, k3s, OpenShift).
Enforce Kubernetes security best practices (RBAC, PodSecurity, secrets, network policies).
Monitor and tune Kubernetes workloads for performance and reliability.
Administer, operate, and troubleshoot distributed database systems (e.g., Cassandra, MongoDB, CockroachDB, etcd) within Kubernetes, ensuring high availability, data consistency, and performance.
Ensure database high availability and scalability, and maintain backup/recovery and disaster recovery strategies.
Implement observability stacks (Prometheus, Grafana, ELK, Zabbix, etc.) for infrastructure and applications.
Partner with dev teams to design scalable deployment patterns and troubleshoot pipeline/build/deploy issues.
Maintain detailed technical documentation for environments, playbooks, and architectural decisions.
Mentor peers and team members in DevOps tools, Kubernetes, and cloud-native practices.
Requirements
3+ years managing Kubernetes in production.
Expertise with container tools (Docker, Podman) and orchestration (Kubernetes, Helm).
Strong CI/CD experience with GitLab, Jenkins, Argo CD, and GitOps workflows.
Proficient in Infrastructure as Code (Terraform, CloudFormation, Ansible).
Deep knowledge of managing distributed databases in Kubernetes, including StatefulSets, PVCs, and dynamic volume provisioning, as well as backup, recovery, scaling, and clustering techniques.
Experience with on-prem, AWS, GCP, Azure, or OpenStack environments; hybrid/multi-cloud experience preferred.
Familiarity with service meshes and Kubernetes networking (Istio, Calico, Cilium).
Proficient in Bash, Python, or similar scripting languages.
Strong analytical and troubleshooting abilities across app, infra, and DB layers.
Clear communication and ability to collaborate across development, QA, security, and operations.
Self-motivated, detail-oriented, and comfortable in fast-paced, on-call environments.
Excellent documentation habits and focus on operational excellence.
Familiarity with compliance standards (FIPS, FedRAMP, FISMA).
Certifications in Kubernetes (CKA/CKAD), AWS/GCP, or Terraform.
Benefits
Competitive Salary & Incentives: We offer a competitive compensation package with pre-IPO equity to reward your hard work and dedication.
Health & Wellness: Comprehensive medical, dental, and vision insurance plans to ensure you and your family stay healthy and covered.
Paid Time Off (PTO): Enjoy a generous PTO policy that includes vacation days, sick leave, and paid holidays to recharge and take care of personal matters.
Flexible Work Environment: We understand the importance of work-life balance. Enjoy the flexibility of remote and hybrid options to create the work schedule that works best for you.
Professional Development: We believe in continuous learning. Access to training, certifications, and educational resources to help you grow in your career and stay ahead of industry trends.
Employee Recognition: We celebrate achievements both big and small, with regular recognition programs and awards that highlight your contributions to our collective success.
Collaborative Culture: Be part of a dynamic, inclusive, and supportive team where innovation and collaboration are at the heart of everything we do.
Parental Leave: Generous parental leave policies to support you during life's important moments.