Hybrid Lead Data Scientist – Gen AI, Digital Twin

Posted 9 hours ago

About the role

  • As Lead Data Scientist, you will develop digital twins and generative AI predictions for Caterpillar's digital applications, collaborating across teams to create robust, automated diagnostic workflows and real-time monitoring solutions.

Responsibilities

  • Algorithm Development & Modeling
  • Anomaly Detection: Design and implement GPU-accelerated machine learning models (e.g., XGBoost, autoencoders, and GANs) to identify irregular patterns in high-frequency sensor data.
  • Digital Twin Engineering: Partner with engineering teams to develop onboard digital twins using NVIDIA architecture to simulate, predict, and optimize the performance of heavy machinery.
  • Optimization: Profile and tune deep learning algorithms for maximum efficiency on NVIDIA GPU architectures, ensuring high throughput and low latency for real-time monitoring.
  • Testing Onboard Architecture & Integration
  • Edge Deployment: Adapt and test algorithms for onboard architecture, leveraging tools like NVIDIA Jetson and real-time edge processing on Cat equipment.
  • Hardware-Software Co-Design: Collaborate with hardware / simulation engineers to ensure algorithm compatibility with next-generation processors and specialized onboard compute modules.
  • Simulation-Based Training: Use high-fidelity digital twins to simulate rare failure scenarios, ensuring the GenAI assistant provides accurate troubleshooting steps for edge-case mechanical issues.
  • GenAI Algorithm Automated Diagnostic Workflows: Develop Generative AI agents that synthesize telematics data to generate prioritized repair recommendations for identified machine faults.
  • Unified Data Orchestration: Integrate multi-modal outputs from condition monitoring analytics and asset life history to create a machine-specific context for the AI assistant.

Requirements

  • Typically, a Bachelor's, Master's, or PhD degree in Applied Statistics, Data Science, Business Analytics, Predictive Analytics, Business Intelligence & Analytics, Mathematics, Computer Science, Engineering (Aerospace, Electrical, Mechanical, Computer, Industrial, Agricultural, etc.), or an equivalent technical degree.
  • Extensive experience applying Python (NumPy, SciPy, pandas, etc.) programming to solve business challenges.
  • Extensive experience (typically 8+ years) with advanced data analysis, machine learning methods such as clustering, logistic regression, and neural networks, and statistical methods such as statistical process control.
  • Experience in practical applications of onboard architecture / software (mini projects using a Raspberry Pi or any other architecture are a bonus).
  • Working experience with heavy equipment engineering or data analysis.
  • Working knowledge of cloud technologies (AWS, Azure, Google Cloud, etc.).
  • Advanced experience with version control / repositories such as GitHub
  • Experience operating in an Agile environment
  • Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively.

Benefits

  • Medical, dental, and vision benefits*
  • Paid time off plan (Vacation, Holidays, Volunteer, etc.)*
  • 401(k) savings plans*
  • Health Savings Account (HSA)*
  • Flexible Spending Accounts (FSAs)*
  • Healthy Lifestyle Programs*
  • Employee Assistance Program*
  • Voluntary Benefits and Employee Discounts*
  • Career Development*
  • Incentive bonus*
  • Disability benefits
  • Life Insurance
  • Parental leave
  • Adoption benefits
  • Tuition Reimbursement

Job title

Lead Data Scientist – Gen AI, Digital Twin

Experience level

Senior

Salary

$128,470 - $208,770 per year

Degree requirement

Bachelor's Degree
