Senior Site Reliability Engineer building and scaling cloud networking solutions at Lambda, the superintelligence cloud provider. Join us to automate and enhance networking infrastructure for AI applications.
Responsibilities
Help scale Lambda’s high-performance multi-tenant cloud network
Contribute to the reproducible automation of network configuration and deployments
Contribute to the implementation and operations of Software Defined Networks
Help deploy and manage spine-and-leaf networks
Ensure high availability of our network through observability, failover, and redundancy
Ensure clients have predictable networking performance through the use of network engineering and other applicable technologies
Help with deploying and maintaining network monitoring and management tools
Participate in on-call
Requirements
Have 5+ years of experience as an SWE, SRE, or Network Reliability Engineer
Have been part of the implementation of production-scale networking projects
Have experience with on-call rotations and incident response management
Have experience building and maintaining Software Defined Networks (SDN), as well as experience with OpenStack, Neutron, and OVN
Are comfortable on the Linux command line and understand the Linux networking stack
Have experience with multi-data center networks and hybrid cloud networks
Have experience programming in Python and using configuration management tools like Ansible
Have experience with Git and CI/CD tools for deployment, and have operated a network environment with GitOps practices in place
Have experience with application lifecycle and deployments on Kubernetes
Benefits
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use