Senior DevOps Engineer ensuring reliable and automated GCP-native infrastructure at Search Atlas. Collaborating across teams to enhance observability and streamline deployment processes.
Responsibilities
Administer and scale our GKE clusters to support a wide range of microservices, with a strong focus on resilience, cost optimization, and performance.
Contribute to the development of GitLab CI/CD pipelines and deployment automation with ArgoCD, supporting secure and traceable infrastructure workflows.
Manage infrastructure across GCP using Terraform, ensuring all systems are modular, scalable, and reproducible.
Implement and extend OpenTelemetry instrumentation. Build and maintain dashboards and alerts using Grafana and Sentry to detect and resolve issues quickly.
Support PostgreSQL, Elasticsearch, and ClickHouse operations, helping monitor performance, ensure uptime, and reduce cost across data layers.
Participate in on-call rotations, troubleshoot production issues, and contribute to disaster recovery and high-availability strategies.
Partner with backend, frontend, and QA teams to improve infrastructure reliability, streamline deployments, and ensure platform stability.
Requirements
5+ years of experience in DevOps or SRE roles working in production environments.
Strong proficiency with Kubernetes (GKE preferred) and GitOps workflows using ArgoCD.
Deep knowledge of GCP infrastructure and Terraform-based IaC practices.
Experience with OpenTelemetry for distributed tracing and instrumentation.
Expertise in Grafana, Datadog, and Sentry for observability and monitoring.
Operational knowledge of PostgreSQL, Elasticsearch, and ClickHouse.
Strong troubleshooting skills and experience with incident resolution in production systems.
Effective communication skills and ability to collaborate across teams.
Benefits
15 days of paid time off, plus Christmas Day and New Year's Day as paid holidays
Development Operations Engineer supporting enterprise application development in Java and/or C. Ensuring high availability and operational excellence in modern payment solutions.
Site Reliability Engineer designing and supporting Kubernetes environments for F5's UDF platform. Collaborating with cross-functional teams to ensure reliability and operational excellence.
Senior Site Reliability Engineer ensuring operational excellence for multi-datacenter infrastructure at F5. Developing automation tools and APIs in Python and Go.
DevOps Engineer needed to develop a new OpenXDR solution on AWS, processing security data from multiple sources. Join a leading cybersecurity company in Slovakia.
DevOps Engineer at Castalia Systems automating and optimizing toolchain and CI/CD pipelines. Designing Azure infrastructure and ensuring collaboration between development and operations teams.
Senior DevOps Engineer managing Kubernetes and AI-driven workflows at Hex Trust. Supporting blockchain infrastructure while implementing best DevOps practices.
Lead DevSecOps Software Developer at Leidos enhancing automation for air traffic operations. Collaborating on safety-critical systems within a hybrid work environment.