Senior Software Engineer, LLM Inferencing, AI Gateway at Wells Fargo | Hybrid Hired

About the role

Senior Software Engineer designing and building the GPU-based GenAI platform for Wells Fargo. Leading technical initiatives and collaborating with teams to resolve AI challenges.

Responsibilities

Lead complex Generative AI initiatives and deliverables within technical domain environments
Contribute to large scale planning of strategies
Design, code, test, debug, and document projects and programs associated with the technology domain, including upgrades and deployments
Review moderately complex technical challenges that require an in-depth evaluation of technologies and procedures
Resolve moderately complex issues and lead a team to meet the needs of existing clients or potential new clients while leveraging a solid understanding of the function, policies, procedures, or compliance requirements
Collaborate and consult with peers, colleagues, and mid-level managers to resolve technical challenges and achieve goals
Lead projects and act as an escalation point, provide guidance and direction to less experienced staff
Engineer GPU clusters and node pools; configure NVLink/NVSwitch, NVIDIA GPU Operator, MIG profiles, container runtime, and kernel/driver baselines for high‑throughput LLM/SLM workloads

Requirements

4+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
1+ years of experience with GPU inference, including NVIDIA CUDA, cuDNN, NVLink/NVSwitch, MIG, NIXL, GPU profiling, and performance tuning on H100/H200 architectures
1+ years of experience with GPU orchestration platforms, such as RunAI (collections, queues, quotas, preemption, fair-share scheduling), OpenShift AI (RHOAI), and cluster administration on OCP or GKE
1+ years of experience with LLM/SLM serving frameworks, including vLLM, Triton, TensorRT‑LLM/MII, KV‑cache optimization strategies, and FP8/INT4 quantization techniques (AWQ/GPTQ)
1+ years of experience working with LLM API gateways, including OAuth2/mTLS authentication, rate‑limiting and quota management, OpenAPI/SDK integration, SLAs, and versioning/deprecation practices
2+ years of experience in Generative AI engineering, including LLM/SLM operations, fine‑tuning, evaluation pipelines, and developing model‑specific performance optimization recipes
4+ years of experience in Python, including scripting, automation, and model/inference‑related development

Benefits

Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance
Parental leave
Critical caregiving leave
Discounts and savings
Commuter benefits
Tuition reimbursement
Scholarships for dependent children
Adoption reimbursement
