As an SRE at CellPoint Digital, you’ll be a key player in ensuring our payment platform runs reliably, securely, and at scale, processing thousands of payments per second.
Working closely with our Product, Development, and Architecture teams, you’ll blend hands-on operational excellence with a software engineering mindset to drive automation, observability, and reliability across our global infrastructure.
Requirements
Ensure the production environment runs smoothly, with a holistic view of system health
Build software and systems to manage infrastructure and applications
Drive improvements in reliability, quality, and delivery speed of our payment solutions
Measure and optimize system performance, always looking to innovate and get ahead of customer needs
Provide operational support and engineering expertise for large-scale, distributed systems
Collaborate with Product, Development, and Architecture to define and share SLAs and improve system reliability
Partner with our Release Manager to deploy and troubleshoot new versions of our platform and services
Participate in an on-call rotation (Grafana IRM) to respond to incidents that affect availability and to support internal engineering teams
Prevent incidents through robust automation, monitoring, and proactive engineering
Run our modern stack: Google Cloud Platform, Kubernetes, Terraform, GitHub Actions, etc.
Design, build, and maintain core infrastructure that supports massive scale and high availability
Debug production issues across services and infrastructure layers
Plan and execute infrastructure growth to meet future demand
Benefits
Competitive salary in a fast-growing start-up
Rewards & Recognition system
Opportunity for personal and professional growth in a dynamic industry
Work from anywhere in the world; we're a fully distributed company, and we provide the tools, culture, and support to make your work setup work for you
Occasional travel to Europe (UK, Copenhagen, Bulgaria)