Principal DevOps Engineer for AI Infrastructure Engineering at Northrop Grumman, designing scalable systems and collaborating with cross-functional teams.
Responsibilities
Design and implement systems capable of scaling to meet growing data and computation demands in AI applications.
Implement AI infrastructure that complies with relevant regulations and security standards, including the necessary controls.
Provide metrics to evaluate the performance and reliability of AI infrastructure and implement improvements based on findings.
Track advancements in AI technologies and infrastructure trends and evaluate applicability to the organization’s strategy.
Work alongside data scientists, software engineers, and i2 functional teams to provide infrastructure support for new and existing systems.
Requirements
Must have 5 years of relevant experience with a Bachelor's degree OR 3 years of relevant experience with a Master's degree OR 9 years of relevant experience in lieu of the degree requirement
Must have a minimum of 3 years of experience implementing the back-end infrastructure for COTS or SaaS platforms
Must have a minimum of 2 years of experience working with cloud (AWS or Azure) infrastructure and services
Must have experience with DevOps practices and tools for automation (Ansible, ArgoCD), CI/CD, and configuration management
Must have hands-on experience with relevant programming languages (e.g., Python, Java, PowerShell/Bash)
Must have experience managing Windows and Linux server environments
Must have basic knowledge of database technologies (e.g., SQL, NoSQL) and distributed computing systems
Must have exceptional verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders.
Benefits
Health insurance coverage
Life and disability insurance
Savings plan
Company-paid holidays
Paid time off (PTO) for vacation and/or personal business
DevSecOps Engineer architecting CI/CD framework services for Truist, enhancing the flow of business value through DevSecOps practices. Building and maintaining automation for software delivery and operations.
Application Security Manager at Evertec, handling security strategy and implementation in financial tech. Leading efforts in Application Security, DevSecOps, and compliance with financial regulations.
Databricks Senior DevOps Engineer designing and operating platforms on AWS and Databricks for Financial Crime. Focused on platform infrastructure, governance, security, and operations.
Site Reliability Engineer at Assecor, focusing on SLIs, SLOs, and incident management. Enhancing performance and reliability through observability and automation in a hybrid work environment.
DevOps Architect at Ascensus, responsible for technical direction and oversight for application engineering practices across scrum teams. Promotes DevOps culture and innovative solutions.
Cloud Site Reliability Engineer ensuring scalability, performance, and reliability of cloud infrastructure deployed in Woven City. Working with product owners and teams for innovative solutions.
Senior DevOps Engineer supporting enterprise-grade Kubernetes infrastructure and CI/CD automation for U.S. Army projects. Engaging in critical system designs and automation processes with a focus on cloud-based platforms.
Reliability Engineer focusing on mechanical systems in a long-standing Australian FMCG company. Ensuring ongoing reliability improvements and supporting plant operations for iconic cereal production.
Software Engineer 2 developing full-stack solutions for U.S. Bank. Collaborating with teams to design and maintain best-in-class software experiences.
Principal Software Engineer at FIS driving reliability and performance in fintech environments. Collaborating across teams for high-scale, high-reliability solutions in the finance sector.