DevOps Engineer for Payment Orchestration Platform ensuring smooth AWS and Google Cloud operations. Building, deploying, and scaling global payments infrastructure while collaborating across teams.
Responsibilities
Design, build, and maintain scalable, secure, and highly available infrastructure on AWS and Google Cloud Platform (GCP).
Collaborate closely with software engineers to automate CI/CD pipelines for Java-based services and APIs.
Implement and manage Infrastructure as Code (IaC) using tools such as Terraform or CloudFormation.
Monitor system health and performance using observability tools (e.g., Prometheus, Grafana, CloudWatch, Stackdriver).
Drive continuous improvements in system reliability, fault tolerance, and disaster recovery processes.
Ensure compliance and security best practices across all environments, including PCI DSS requirements for payment data.
Contribute to performance tuning and optimization of Java-based applications and services.
Manage containerized workloads using Docker and Kubernetes (EKS/GKE).
Collaborate with cross-functional teams (Product, Security, Engineering) to enhance deployment workflows and incident response.
Requirements
3–5+ years of hands-on experience as a DevOps Engineer, Site Reliability Engineer, or in a similar role.
SRE responsible for ensuring reliability and performance of IT systems at a digital transformation company specializing in public sector efficiency. Collaborating on system health, incident response, and automation tasks.
DevOps Senior role at Beyond Soluções managing CI/CD for .NET and Kubernetes applications. Collaborating on cloud solutions while fostering a culture of innovation and quality.
Senior Software Engineer at PayPal managing cloud infrastructure and DevOps solutions. Delivering complete SDLC solutions and guiding engineering teams for scalable and reliable services.
Senior Site Reliability Engineer at Diligent leading reliability, automation, and observability across cloud infrastructure. Building tools for incident response and enhancing performance in fast-paced environments.
Perception Deployment Engineer deploying deep learning models on embedded systems at Caterpillar. Collaborating with cross-functional teams for integration and optimization of perception modules in vehicles.
Principal Site Reliability Engineer at AT&T required to design scalable solutions for critical operations with minimal downtime. Collaborating with teams to monitor and improve system performance in cloud environments.
DevOps Engineer managing AI SaaS infrastructure at a high-growth European company. Supporting AI model deployment and ensuring platform security and compliance with multiple systems integration.
Engineering Manager leading teams for observability platforms at LexisNexis. Owns operational excellence across software delivery lifecycle in Raleigh, NC.
Reliability Engineer optimizing site facility infrastructure and utility systems at Roche. Conducting root cause analyses and developing maintenance plans to enhance reliability and efficiency.
DevOps SME designing, implementing, and operating multi-cloud platforms for The Missing Link. Collaborating with engineering, security, and operations teams while embedding DevOps best practices.