Lead Software Engineer within Wells Fargo Digital Technology, focusing on GPU infrastructure and GenAI model inferencing. Building high-performance, reliable, and secure model serving systems.
Responsibilities
Lead complex technology initiatives including those that are companywide with broad impact
Act as a key participant in developing standards and companywide best practices for engineering complex and large-scale technology solutions for technology engineering disciplines
Design, code, test, debug, and document for projects and programs
Review and analyze complex, large-scale technology solutions for tactical and strategic business objectives, enterprise technological environment, and technical challenges that require in-depth evaluation of multiple factors, including intangibles or unprecedented technical factors
Make decisions in developing standards and companywide best practices for engineering and technology solutions requiring understanding of industry best practices and new technologies, influencing and leading the technology team to meet deliverables and drive new initiatives
Collaborate and consult with key technical experts, senior technology team, and external industry groups to resolve complex technical issues and achieve goals
Lead projects or teams, or serve as a peer mentor
Engineer GPU clusters and node pools; configure NVLink/NVSwitch, NVIDIA GPU Operator, MIG profiles, container runtime, and kernel/driver baselines for high-throughput LLM/SLM workloads.
Design and implement OpenAI-compatible APIs (Responses, Interactions) behind the AI Gateway: define OpenAPI contracts, authN/Z (OAuth2/mTLS), rate limits/quotas, SLAs, versioning/deprecation, and SDK generation.
Build and support MCP servers and tool adapters; manage agent/tool identity and capability metadata; integrate with agent registries and execution flows.
Develop Agentic AI capabilities (tools/agents/workflows) including disaggregated prefill/decode patterns; contribute to runbooks, guardrails, and safe tool usage.
Build UI surfaces (developer/ops consoles) for endpoint onboarding, prompt testing, evaluations, observability dashboards, and incident response workflows.
Apply prompt engineering and evaluation best practices; create golden test suites, regression harnesses, and measurable SLO-aligned criteria for production promotion.
Requirements
5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years of experience in Python for backend/services development, packaging, instrumentation, and automation
5+ years of experience building modern web UI for developer/ops workflows, including dashboards, wizards, and prompt/eval tooling, with strong testing and accessibility practices
1+ years of experience building MCP servers, tool adapters, and agent workflows, with an understanding of agent identity, permissions, and governance metadata
2+ years of experience in GenAI engineering, including LLM/SLM operations, fine-tuning/evaluation, per-model performance recipes, and prompt engineering and evaluation harnesses
1+ years of experience with LLM API exposure, including AI Gateway: OAuth2/mTLS, rate limits/quotas, OpenAPI/SDKs, SLAs, versioning/deprecation, and OpenAI-compatible API design for responses and interactions
1+ years of experience with serving large language models (LLM/SLM), including vLLM, Triton, TensorRT-LLM/MII, KV cache strategies, FP8/INT4 AWQ/GPTQ, and certified disaggregated prefill/decode
1+ years of experience with orchestration tools for GPU workload management, such as Run:AI (Collections/queues, quotas, preemption, fair share), OpenShift AI (RHOAI), and OCP/GKE administration
1+ years of experience with GPU Inference Layer, including NVIDIA and CUDA technologies such as CUDA, cuDNN, NVLink/NVSwitch, MIG, NIXL, GPU profiling, and H100/H200 performance tuning
Benefits
No visa sponsorship available
No relocation assistance for this position
Job title
Lead Software Engineer – Gen AI Inferencing Services, Agentic AI