Staff Backend Engineer leading engineering efforts for Cloudinary's AI algorithm platform, with a focus on system architecture, scalability, and GPU management.
Responsibilities
Own the architecture, stability, scalability, and performance of the system.
Design and implement platform features that support both synchronous low-latency and asynchronous compute-heavy algorithm execution.
Enhance GPU management, scheduling, and resource allocation for optimal performance and cost-efficiency.
Ensure robust Kubernetes-based deployment and observability for a highly dynamic system.
Act as the technical bridge between Research and Application teams by translating requirements into scalable system designs.
Collaborate closely with algorithm developers to streamline model deployment processes.
Partner with backend engineers (primarily working in Ruby and Go) to integrate the research group algorithms into Cloudinary services.
Advocate for high standards in code quality, observability, testing, and security.
Guide engineering integration efforts when consuming the different platform APIs.
Provide mentorship, support, and best practices to other engineers interacting with the platform.
Take part in general R&D efforts, supporting a broader production environment.
Contribute to the evolution of our platform to support a wider range of algorithmic workloads and model types.
Help shape tooling and infrastructure for model versioning, rollout, monitoring, and testing.
Collaborate with DevOps and Infrastructure teams to maintain operational excellence, system observability, and robust infrastructure support.
Requirements
8+ years of experience in software engineering, with 3+ years working on infrastructure/platforms involving ML/AI, GPU, or data-heavy systems.
Proficiency in Python and familiarity with backend languages such as Ruby and/or Go.
Strong understanding of Kubernetes internals and experience running GPU workloads in production environments.
In-depth knowledge of AWS services.
Experience architecting systems that support both real-time and asynchronous processing pipelines.
Familiarity with the ML lifecycle and MLOps practices, including CI/CD for models, monitoring, and rollback strategies.
Benefits
Opportunity to build and scale a one-of-a-kind platform powering state-of-the-art media algorithms.
Collaborate with world-class research, engineering, and product teams.
Have a direct impact on product experiences used by millions of developers and end-users.
Be part of a culture that values creativity, autonomy, and continuous improvement.
Staff Rust Software Engineer responsible for designing and developing infotainment systems. Collaborating on high-performance HMI development for Ford's electric vehicles team.
Lead Backend Engineer at Polarsteps, developing a travel app for 19 million users. Responsible for platform engineering leadership and scalable architecture decision-making.
C#/.NET Software Engineer developing high-quality software solutions for Euronet's E-Commerce ecosystem. Collaborating with teams to design and deliver robust applications using Microsoft technologies.
Application Support Analyst ensuring optimal performance and reliability of production systems for a digital solutions provider. Collaborating with development, DevOps, and QA teams to enhance user satisfaction.
Senior Full-Stack Developer designing and developing solutions for Equisoft’s product lineup. Collaborating with cross-functional teams in a hybrid working environment to deliver innovative digital solutions.
Full Stack Developer evolving applications and services at Amo Promo utilizing Python and ReactJS while ensuring product quality and collaboration with the team.
Join KIPMI Software as a Java Principal Engineer leading the development of digital trust technologies. Collaborate across teams while employing cutting-edge tools and best practices.
Senior developer managing critical Microsoft systems at SBM Technology. Ensuring stability of applications and data management in Oracle environments with high reliability demands.
Senior Backend Developer responsible for high-performance .NET Core applications at a financial institution. Collaborating on cloud and on-premises solutions with a focus on security and scalability.
Develop and maintain backend services and RESTful APIs using Node.js and Python; implement database persistence (PostgreSQL), caching (Redis), and asynchronous messaging (RabbitMQ). Hybrid role in Goiânia with cross-functional collaboration across product, front-end and infrastructure teams.