Senior Systems Engineer at NVIDIA focused on improving AI cluster resiliency and delivering AIOps solutions. Collaborating with team members to debug complex issues and enhance customer satisfaction.
Responsibilities
Gather and understand internal and external customer requirements to improve AI cluster resiliency, and design AIOps-based solutions that address those requirements
Develop automated workflows for issue detection and root cause analysis, and collaborate closely with operators to debug sophisticated, full-stack AI cluster problems
Deliver compelling technical presentations and lead hands-on demos or training
Handle evaluation deployments (POC/POV) and ensure smooth, reliable installations by staying engaged throughout the customer journey
Requirements
Bachelor of Science or equivalent experience
8+ years of networking experience in enterprise or service provider environments, with strong hands-on expertise in routing and switching
Proficient in scripting and automation using Python or similar languages, with strong Linux expertise
Proven experience working directly with customers to resolve issues and ensure success in Systems Engineer or SRE roles
Exceptional oral, written, and presentation skills for clearly communicating complex technical topics
Demonstrated ability to collaborate effectively across teams, partnering with operations, engineering, and product development
Benefits
Equity
Benefits
Job title
Senior Systems Engineer, Artificial Intelligence Operations
Windows Domain System Engineer managing Windows Server environments and supporting system performance at HII. Collaborating with virtualization engineers and providing Tier 2/3 technical support while ensuring security compliance.
IT Systems Analyst supporting the analysis, administration, and integration of systems at Truliant. Collaborating with business and IT teams to enhance workflows and system performance.
Staff Linux Systems Engineer designing and maintaining Linux-based software for cloud-driven networking solutions at Extreme Networks. Collaborating with cross-functional teams to ensure successful project execution.
Developing and maintaining SQL and PL/SQL systems for Polinutri, a Brazilian company focused on animal nutrition. Involves API integration and ERP support.
Systems Engineer I developing a variety of innovative Unmanned Systems for defense and commercial applications. Collaborating with design team and Controls Engineers to drive simulation models.
Business Systems Analyst providing technical knowledge and support in operational system issues. Collaborating with IT and business teams to optimize processes and resolve issues.
As a Systems Engineer at Capita, monitor and support IT systems and infrastructure for customers. Collaborate within a dynamic 24/7 team in Belfast, addressing incidents and ensuring uptime.
Supply Chain Systems Analyst driving process excellence and data integrity for plant-based meal delivery. Partnering with teams across operations to enhance efficiency and scalability.
Business Systems Analyst Consultant developing solutions for IT and business management at PNC. Partnering with business and technology stakeholders to align business requirements and functional specifications.
Graduate Systems Engineer focusing on avionics systems development for unmanned air systems at Callen-Lenz. Collaborating with cross-functional teams in an innovative technology environment.