AI Infrastructure Engineer focusing on scalable backend systems for AI workflows in a fast-paced startup. Collaborating on reliability, data performance, and infrastructure for rapid growth.
Responsibilities
Design and implement scalable backend architectures for AI workloads (inference, orchestration, monitoring).
Own distributed job orchestration with Temporal and related systems.
Improve data pipeline performance by designing smarter caching strategies (e.g., file deduplication, hot/cold storage, Redis caching layers) to reduce redundant compute and API calls.
Build observability, monitoring, retries, and fault tolerance into all workflows.
Manage infrastructure reliability, incident response, and performance.
Develop tooling and platform infrastructure to support rapid growth.
Partner with ML engineers to bring models to production at scale.
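To make the caching responsibility concrete, here is a minimal sketch of file deduplication by content hash, so identical inputs trigger only one expensive computation. The class name `DedupCache`, the `compute` callback, and the in-memory dict (a stand-in for a Redis layer) are illustrative assumptions, not part of the role's actual stack.

```python
import hashlib
from typing import Callable, Dict


class DedupCache:
    """Hypothetical cache keyed by the SHA-256 digest of the input bytes."""

    def __init__(self) -> None:
        # In-memory dict used here as a stand-in for a Redis caching layer.
        self._store: Dict[str, str] = {}
        self.misses = 0

    def get_or_compute(self, data: bytes, compute: Callable[[bytes], str]) -> str:
        # Identical file contents map to the same key, regardless of file name.
        key = hashlib.sha256(data).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute(data)  # the expensive call runs once per content
        return self._store[key]
```

In a production setting the dict would likely be replaced by a shared store with expiry, so that hot entries stay fast and cold ones age out.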
Requirements
4+ years of backend engineering (Python is a must).
Strong background in distributed systems, job orchestration, and task queues.
Deep knowledge of concurrency, parallelism, and multithreading—including async/await, event loops, thread pools, synchronization primitives, deadlocks, and race conditions—is a must.
Hands-on experience with Temporal, Redis, Airflow, Celery, RabbitMQ (or similar).
Experience with LLM serving and routing fundamentals (rate limiting, streaming, load balancing, budgets).
Comfortable with containers & orchestration: Docker, Kubernetes.
Familiarity with cloud platforms (AWS/GCP) and IaC (Terraform).
Experience with multiple storage systems: S3, Postgres, MongoDB, Redis, and Elasticsearch.
Track record scaling systems in startups or fast-paced environments.
Understanding of deploying, monitoring, and optimizing AI/ML systems in production with strong CI/CD practices.
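As an illustration of the concurrency and fault-tolerance skills listed above, the following sketch combines an asyncio semaphore (to bound in-flight calls) with retries and exponential backoff. The function names `call_with_retry` and `run_bounded` and the default delays are assumptions chosen for the example; a real system might delegate retries to an orchestrator such as Temporal instead.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def call_with_retry(
    fn: Callable[[], Awaitable[T]], attempts: int = 3, base_delay: float = 0.01
) -> T:
    """Await fn(), retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return await fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise RuntimeError("attempts must be >= 1")


async def run_bounded(tasks: List[Callable[[], Awaitable[T]]], limit: int) -> List[T]:
    """Run all tasks concurrently, with at most `limit` in flight at once."""
    sem = asyncio.Semaphore(limit)

    async def guarded(fn: Callable[[], Awaitable[T]]) -> T:
        async with sem:
            return await call_with_retry(fn)

    # gather preserves the order of the input tasks in its result list.
    return await asyncio.gather(*(guarded(fn) for fn in tasks))
```

Bounding concurrency with a semaphore is one simple way to respect upstream rate limits; a token bucket would be an alternative when the limit is expressed per unit of time.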
IT Infrastructure Specialist managing physical and virtual server environments for Premier League Studios. Ensuring robust workflows and high-performance infrastructure in a hybrid work setting.
Manager of Platform Engineering at a leading insurance company shaping the future of API platforms. Fostering innovation and collaboration while driving platform stability and resiliency.
Infrastructure Engineer responsible for building, monitoring, and securing IT infrastructure for NLACRC. Collaborates with IT personnel and external support to ensure robust infrastructure.
Infrastructure Engineering Intern working on cloud solutions at a global growth engine for commerce. Collaborating on secure, scalable systems and contributing to performance optimization.
Infrastructure Engineer supporting IT service management and implementing complex system solutions. Collaborating with business units and training junior team members in a hybrid environment.
Infrastructure Engineering Lead overseeing edge security initiatives for Lloyds Banking Group. Driving the development of security capabilities and mentoring engineering teams.
Lead Infrastructure Engineer focusing on web access protection and security strategies at Lloyds Banking Group. Managing infrastructure improvements and team leadership in enterprise environments.
Senior Infrastructure Engineer maintaining IT infrastructure and datacentre operations for Walkers Global. Installing, configuring, and troubleshooting various hardware and cloud services in a hands-on role.
Infrastructure Architect responsible for designing and implementing multi-cloud infrastructures. Collaborating with teams to ensure high availability, security, and cost efficiency in cloud environments.
Senior Database Administrator specializing in private cloud technologies for a fintech company's modernization agenda. Focused on database platform engineering with MS SQL and PostgreSQL.