Senior Infrastructure Engineer at Provable designing and maintaining scalable GCP infrastructure. Collaborating with developers and automating processes to enhance security and efficiency.
Responsibilities
Design, implement, and maintain reliable and scalable GCP and GKE infrastructure.
Deliver metrics and logging pipelines for infrastructure and applications using native GCP services and managed Grafana and Elastic logging services.
Manage and optimize database technologies, with a focus on Postgres.
Collaborate with developers to design reliable applications and ensure seamless deployment of new features and updates.
Develop and maintain automation to improve efficiency and reduce manual intervention.
Implement security best practices to protect services and align with industry standards.
Conduct root cause analysis of incidents and implement preventive measures to avoid recurrence.
Participate in on-call rotations to provide 24/7 support for production systems.
Establish and document cloud infrastructure standards and practices.
Requirements
5+ years of experience designing and operating cloud infrastructure.
Expert knowledge of infrastructure- and configuration-as-code, particularly Terraform, Ansible, and Kubernetes.
Expert knowledge of managed Kubernetes design and operation, including Helm, Istio, ingress, node pools, security, and deployment pipelines.
Experience in implementing and managing monitoring solutions, including Prometheus and Grafana.
Strong background in automation using Ansible, Terraform, Bash, Git, and GitHub workflows.
Proven track record of leading technical initiatives and mentoring team members.
Excellent communication skills with the ability to explain complex technical concepts.
Deep understanding of cloud security best practices and compliance requirements.
Benefits
Monthly budget for remote work expenses (home office setup & supplies, transportation, fitness & personal well-being, continued learning, etc.)
Comprehensive, top-tier healthcare coverage
Flexible vacation policy
Opportunity to attend major industry conferences and global events
Regular team off-sites and retreats
Professional development budget for cloud certifications and training
Cloud Infrastructure Engineer responsible for Langfuse Cloud operations and observability at scale. Managing AWS and ClickHouse deployment to ensure performance and cost optimization.
Site Infrastructure Engineer managing HVAC and utility systems at SABIC. Overseeing maintenance, project activities, and long-term asset strategies for operational efficiency.
Key engineer developing and operating Web Application Firewall (WAF) platforms at Lloyds Banking Group. Enhancing security and performance while working with modern engineering practices.
Lead Infrastructure Engineer driving Edge Security capabilities for Lloyds Banking Group. Focusing on web access protection, Zero Trust architectures, and modern security engineering approaches.
Senior System Administrator & Infrastructure Engineer managing reliable infrastructure and driving DevOps practices at IMAGO. Collaborating with development teams and providing technical guidance to ensure best practices.
Infrastructure Engineer maintaining high availability of systems at mortgage platform provider Pylon. Focus on developer productivity and codebase quality with instant feedback from peers.
Infrastructure Systems Engineer II managing production application support for Conduent. Collaborating on ITIL processes and incident management while working in a 24/7 environment.
OT Cybersecurity Specialist responsible for secure IT-OT infrastructures in industrial operations. Engaging in secure deployments, integrating cybersecurity frameworks, and providing expert support.
Infrastructure and Security Engineer collaborating on the design of secure architectures at CRG Solutions. Integrating cybersecurity best practices and managing incidents in Windows and Linux environments.
Senior Infrastructure Engineer managing global IT infrastructure for aviation solutions, focusing on VMware, Nutanix, and Windows Server environments. Collaborating with teams to ensure high availability and optimal performance in a hybrid work model.