Lead Software Engineer responsible for DevOps best practices and the Azure ecosystem for AI solutions. Collaborating with teams on software design, development, and mentorship.
Responsibilities
Build and maintain infrastructure, CI/CD processes, and monitoring for AI solutions developed using C#, .NET, and Azure.
Lead the design and development of DevOps solutions and infrastructure, ensuring they meet business needs and technical requirements.
Provide technical leadership and mentorship to the engineering team, fostering a collaborative and high-performance environment.
Collaborate with product management, design, and quality assurance teams to ensure seamless integration and delivery of software solutions.
Troubleshoot and resolve complex technical issues, ensuring high-quality solutions that are scalable, reliable, and performant.
Ensure adherence to DevOps best practices, including industry standards, testing, and documentation.
Take ownership of key features and components, driving their design and implementation from concept to delivery.
Continuously assess and improve engineering processes and automation, keep up with AI advancements, and recommend new tools and technologies to enhance productivity and software quality.
Participate in code reviews, ensuring that the team follows best practices and maintains high standards for Azure infrastructure and pipelines.
Requirements
7+ years of hands-on experience with DevOps.
Deep understanding of modern CI/CD, industry best practices and tooling.
Sound understanding of Azure cloud services, infrastructure, pipelines, scalability and architecture.
Experience deploying solutions using Azure OpenAI and Azure Foundry, including model versioning and quota management.
Experience with RAG pipelines (Azure AI Search, embedding pipelines, indexing jobs).
Exposure to agentic frameworks (Semantic Kernel, MCP/A2A plugins, etc.).
Observability design (App Insights/Azure Monitor, OpenTelemetry) and incident response.
Good understanding of a Managed Identity-first approach, Key Vault integration, secure network boundaries, Azure Policy, RBAC, networking, and private endpoints.
Solid experience with Azure Bicep, PowerShell scripts, and YAML for IaC.
Understanding of data governance (PII, tenant isolation) for AI workloads.
Benefits
Global hybrid work policy - We ask you to work 2 days a week from the office.
Growth and innovation - Every 6th sprint is reserved for planning and innovation.
Self-Direction - High degree of self-organization.
Inclusive and diverse company culture
Work-life balance – We believe that an equilibrium between professional and personal responsibilities makes us all the best version of ourselves.
Empowerment – We believe that all voices are valuable and must be heard.
DevOps Systems Engineer supporting customer operations in Annapolis Junction, MD. Responsible for creating, sustaining, and troubleshooting complex operational data flows.
OpenShift Fresher assisting Cloud team in managing containerized applications using Red Hat OpenShift. Supporting CI/CD, deployment automation, and cloud-native application environments.
Site Reliability Engineer for Leidos ensuring reliability, performance, and scalability of complex distributed systems for the Navy-Marine Corps Intranet. Collaborating with teams to maintain and optimize network operations and services.
DevOps Engineer evolving banking infrastructure for a fintech company. Focusing on observability, incident response, and platform automation in a hybrid work setup.
Lead DevOps Engineer developing AI-powered supply chain intelligence solutions at S&P Global Mobility. Collaborate with data scientists and engineers to optimize operational infrastructure and continuous delivery processes.
Lead Site Reliability Engineer managing critical IT systems for S&P Dow Jones Indices. Focused on service availability, incident management, and developer collaboration to enhance operational reliability.
Senior SRE Engineer ensuring reliability and performance of AI products at Plaud. Designing scalable systems and leading incident response to improve operational maturity.
Senior DevOps Engineer managing development and deployment pipelines for AI products at Plaud. Optimize infrastructure, enhance productivity, and collaborate with cross-functional teams.
DevOps Engineer supporting big data solutions and AWS infrastructure deployment at Enlighten. Collaborating with teams to ensure reliability, scalability, and performance of cloud services.
Senior Reliability Engineer at Freeport-McMoRan focusing on reliability in copper mining operations. Leading continuous improvement efforts to enhance equipment efficiency and reduce failures.