Lead Data Scientist developing digital twins and generative AI predictions for Caterpillar's digital applications. Collaborating across teams to create robust, automated diagnostic workflows and real-time monitoring solutions.
Responsibilities
Algorithm Development & Modeling
Anomaly Detection: Design and implement GPU-accelerated machine learning models (e.g., XGBoost, autoencoders, and GANs) to identify irregular patterns in high-frequency sensor data.
Digital Twin Engineering: Partner with engineering teams to develop onboard digital twins using NVIDIA architecture to simulate, predict, and optimize the performance of heavy machinery.
Optimization: Profile and tune deep learning algorithms for maximum efficiency on NVIDIA GPU architectures, ensuring high throughput and low latency for real-time monitoring.
Onboard Architecture Testing & Integration
Edge Deployment: Adapt and test algorithms for onboard architecture, leveraging tools like NVIDIA Jetson and real-time edge processing on Cat equipment.
Hardware-Software Co-Design: Collaborate with hardware / simulation engineers to ensure algorithm compatibility with next-generation processors and specialized onboard compute modules.
Simulation-Based Training: Use high-fidelity digital twins to simulate rare failure scenarios, ensuring the GenAI assistant provides accurate troubleshooting steps for edge-case mechanical issues.
GenAI Algorithm
Automated Diagnostic Workflows: Develop Generative AI agents that synthesize telematics data to generate prioritized repairs for identified machine faults.
Unified Data Orchestration: Integrate multi-modal outputs from condition monitoring analytics and asset life history to create a machine-specific context for the AI assistant.
Requirements
Typically, a Bachelor's, Master's, or PhD degree in Applied Statistics, Data Science, Business Analytics, Predictive Analytics, Business Intelligence & Analytics, Mathematics, Computer Science, Engineering (Aerospace, Electrical, Mechanical, Computer, Industrial, Agricultural, etc.), or equivalent technical degree.
Extensive experience applying Python (NumPy, SciPy, pandas, etc.) programming to solve business challenges.
Extensive experience (typically 8+ years) with advanced data analysis, machine learning methods such as clustering, logistic regression, and neural networks, and statistical methods such as statistical process control.
Experience in practical applications of onboard architecture / software; mini projects using Raspberry Pi or any other architecture are a bonus.
Working experience with heavy equipment engineering or data analysis.
Working knowledge of cloud technologies (AWS, Azure, Google Cloud, etc.).
Advanced experience with version control / repositories such as GitHub.
Experience operating in an Agile environment.
Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively.
Benefits
Medical, dental, and vision benefits*
Paid time off plan (Vacation, Holidays, Volunteer, etc.)*
Senior Data Scientist at Analytic Partners designing advanced data science and AI solutions. Collaborating with teams to drive innovations and impact on products and clients.
Data Scientist supporting management consultants in conducting data analysis and providing strategic decision-making insights. Involves data processing, model development, and collaborative problem-solving with clients.
Data Scientist analyzing large datasets to discover trends and supporting business stakeholders with data-driven insights. Designing machine learning models and presenting information through data visualization.
Senior Data Scientist designing generative AI applications at Roche, leveraging extensive expertise in AI and business applications. Collaborating with teams and influencing technical priorities in a dynamic environment.
Data Scientist building extensible multi-agent infrastructure at Roche. Focused on Generative AI solutions transforming healthcare and biotech operations.
Data Scientist executing complex data science projects in Pharma R&D at Roche. Leading AI initiatives and data integration efforts to drive strategic decision-making.
Senior Data Scientist building machine learning solutions for Kempower's EV charging software. Collaborating across teams and mentoring junior colleagues in a hybrid work environment.
Data Science Intern supporting AI/ML initiatives within Foundation GEOINT. Working on computer vision and geospatial data analysis for government customers.
Data Scientist helping drive customer success and engagement using data-driven insights at OpenAI. Collaborating with various business units to optimize performance and foster growth.
Senior Data Scientist developing NLP and data science solutions for fast-evolving markets at LSEG. Collaborating with Subject Matter Experts to ensure production-ready, customer-focused outcomes.