Cloud DevOps Engineer focusing on automated infrastructure across cloud and on-prem environments. Managing Kubernetes, CI/CD pipelines, and collaborating with development and operations teams.
Responsibilities
Design, implement, and maintain CI/CD pipelines using Jenkins, GitLab CI, Argo CD, etc.
Automate infrastructure provisioning and configuration using Terraform, Ansible, and Helm.
Deploy, configure, and manage Kubernetes clusters in cloud and on-premises environments (EKS, AKS, GKE, Rancher, RKE2, k3s, OpenShift).
Enforce Kubernetes security best practices (RBAC, PodSecurity, secrets, network policies).
Monitor and tune Kubernetes workloads for performance and reliability.
Administer, operate, and troubleshoot distributed database systems (e.g., Cassandra, MongoDB, CockroachDB, etcd) within Kubernetes, ensuring high availability, data consistency, and performance.
Define and maintain high-availability, scalability, backup/recovery, and disaster recovery strategies for databases.
Implement observability stacks (Prometheus, Grafana, ELK, Zabbix, etc.) for infrastructure and applications.
Partner with dev teams to design scalable deployment patterns and troubleshoot pipeline/build/deploy issues.
Maintain detailed technical documentation for environments, playbooks, and architectural decisions.
Mentor peers and team members in DevOps tools, Kubernetes, and cloud-native practices.
Requirements
3+ years managing Kubernetes in production.
Expertise with container tools (Docker, Podman) and orchestration (Kubernetes, Helm).
Strong CI/CD experience with GitLab, Jenkins, Argo CD, and GitOps workflows.
Proficient in Infrastructure as Code (Terraform, CloudFormation, Ansible).
Deep knowledge of managing distributed databases in Kubernetes, including StatefulSets, PVCs, dynamic volume provisioning, and backup, recovery, scaling, and clustering techniques.
Experience with on-prem, AWS, GCP, Azure, or OpenStack environments; experience with hybrid/multi-cloud preferred.
Familiarity with service meshes and Kubernetes networking (Istio, Calico, Cilium).
Proficient in Bash, Python, or similar scripting languages.
Strong analytical and troubleshooting abilities across app, infra, and DB layers.
Clear communication and ability to collaborate across development, QA, security, and operations.
Self-motivated, detail-oriented, and comfortable in fast-paced, on-call environments.
Excellent documentation habits and focus on operational excellence.
Familiarity with compliance standards (FIPS, FedRAMP, FISMA).
Certifications in Kubernetes (CKA/CKAD), AWS/GCP, or Terraform.
Benefits
Competitive Salary & Incentives: We offer a competitive compensation package and pre-IPO equity to reward your hard work and dedication.
Health & Wellness: Comprehensive medical, dental, and vision insurance plans to ensure you and your family stay healthy and covered.
Paid Time Off (PTO): Enjoy a generous PTO policy that includes vacation days, sick leave, and paid holidays to recharge and take care of personal matters.
Flexible Work Environment: We understand the importance of work-life balance. Enjoy the flexibility of remote work and a hybrid option to create the schedule that works best for you.
Professional Development: We believe in continuous learning. Access to training, certifications, and educational resources to help you grow in your career and stay ahead of industry trends.
Employee Recognition: We celebrate achievements both big and small, with regular recognition programs and awards that highlight your contributions to our collective success.
Collaborative Culture: Be part of a dynamic, inclusive, and supportive team where innovation and collaboration are at the heart of everything we do.
Parental Leave: Generous parental leave policies to support you during life's important moments.