Senior Software Engineer building AI inference systems for AION's multi-cloud compute platform. Leading design and development of scalable managed services and orchestration systems for GPU workloads.
Responsibilities
Design and architect AION's multi-cloud compute platform, building abstraction layers that unify diverse cloud providers (AWS, GCP, Azure, bare-metal data centers)
Work directly with cloud providers to expand AION's compute pool—understanding pricing, availability zones, GPU types, and capacity planning
Build and maintain AION's managed services
Understand and abstract cloud provider differences in storage (block, object, file systems), networking (VPCs, subnets, security groups), and compute resources
Design composable platform components that enable forward deployments and promote reusability across AION's infrastructure stack
Own end-to-end development of managed services on the compute platform—from design and architecture through execution and production monitoring
Build scalable orchestration systems for GPU workloads, container scheduling, and resource allocation
Develop robust APIs and control planes for compute lifecycle management (provisioning, scaling, termination)
Lead technical discussions on platform reliability, performance optimization, and cost efficiency
Deliver peripheral platform services, including billing systems, usage accounting, observability infrastructure, and compliance tooling
Build monitoring and telemetry systems for compute utilization, cost tracking, and performance metrics
Establish engineering standards for platform development including code reviews, quality gates, and testing practices
Mentor engineers on infrastructure best practices and distributed systems design
Requirements
4+ years of experience building and scaling complex backend systems, cloud infrastructure, or distributed platforms
Strong understanding of multi-cloud architectures and experience working with AWS, GCP, or Azure at scale
Deep knowledge of cloud abstractions: compute (EC2, GCE, VMs), storage (S3, GCS, EBS), networking (VPCs, load balancers, security groups)
Proficiency in Golang strongly preferred; Python, Rust, or other systems languages a plus
Experience with Kubernetes, container orchestration, and infrastructure-as-code (Terraform, Pulumi, CloudFormation)
Solid understanding of distributed systems principles, consensus algorithms, and state management
Experience building APIs, control planes, and platform services for infrastructure management
Familiarity with databases (PostgreSQL, Redis, etcd), message queues (Kafka, RabbitMQ), and event-driven architectures
Knowledge of GPU orchestration, AI/ML workloads, or HPC systems is highly desirable
Experience with observability tools (Prometheus, Grafana, Datadog) and distributed tracing
Understanding of cloud billing models, cost optimization strategies, and resource scheduling
Benefits
**Preferred Attributes:**
High ownership, self-driven, with a bias for action.
Strong strategic thinking and ability to connect technical decisions to business impact.
Excellent communication and mentoring skills.
Thrives in ambiguity, fast-paced environments, and early-stage startup culture.
**Why Join AION?**
Work directly with high-pedigree founders shaping technical and product strategy.
Build infrastructure powering the future of AI compute globally.
Significant ownership and impact with equity reflective of your contributions.
Competitive compensation, flexible work options, and wellness benefits.