Senior Site Reliability Engineer building and scaling cloud networking solutions at Lambda, the superintelligence cloud provider. Join us to automate and enhance networking infrastructure for AI applications.
Responsibilities
Help scale Lambda’s high-performance multi-tenant cloud network
Contribute to the reproducible automation of network configuration and deployments
Contribute to the implementation and operations of Software Defined Networks
Help to deploy and manage Spine and Leaf networks
Ensure high availability of our network through observability, failover, and redundancy
Ensure clients have predictable networking performance through the use of network engineering and other applicable technologies
Help with deploying and maintaining network monitoring and management tools
Participate in on-call
Requirements
Have 5+ years of experience as an SWE, SRE, or Network Reliability Engineer
Have been part of the implementation of production-scale networking projects
Have experience with on-call rotations and incident response management
Have experience building and maintaining Software Defined Networks (SDN), as well as experience with OpenStack, Neutron, and OVN
Are comfortable on the Linux command line, and have an understanding of the Linux networking stack
Have experience with multi-data center networks and hybrid cloud networks
Have experience with Python programming and with configuration management tools like Ansible
Have experience with CI/CD deployment tools and Git, and have operated a network environment with GitOps practices in place
Have experience with application lifecycle and deployments on Kubernetes
Benefits
Health, dental, and vision coverage for you and your dependents
Wellness and Commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use
DevOps Engineer supporting cloud modernization for the Department of the Air Force on the Cloud One contract. Involved in systems analysis, security practices, and collaboration with engineering teams.
Journeyman Cloud Operations Engineer maintaining cloud infrastructure across DoD organizations. Supporting DevSecOps and ensuring compliance with security requirements in a high-visibility program.
DevOps Engineer managing cloud-native platforms for Capgemini. Collaborating with development, data/ML, and security teams to deliver scalable solutions on Azure.
Head of IT & DevSecOps at JamLoop, managing internal technology and security improvements. Leading strategy and implementation of cloud infrastructure for efficiency and reliability.
I&E Maintenance and Reliability Engineer at LyondellBasell focused on asset maintenance strategies in a multidisciplinary environment. Collaborating for operational excellence and safety performance at the Pasadena facility.
Manager, DevOps & Cloud Infrastructure overseeing security and operational efficiency in a hybrid environment at Thomson Reuters. Leading teams to deliver secure solutions in on-premises and cloud setups.
DevOps Engineer responsible for building and maintaining the infrastructure of IONOS' AI platform. Collaborating on CI/CD pipelines and ensuring system optimization across various locations.
DevOps Engineer building and supporting cloud infrastructure at PointClickCare. Collaborate with senior engineers and software teams to enhance AI-enabled workloads and improve system reliability.
DevOps specialist working with Kubernetes and Terraform, ensuring project stability and efficiency for Convercus. Join a small, dynamic team in a hybrid work environment.
Cloud & DevOps Engineer at XTEL managing Azure infrastructure and deploying applications. Collaborating within an international team to drive technological excellence.