Data Science/Gen AI Specialist working on NLP and LLM applications for automotive solutions. Responsibilities include designing, deploying, and showcasing AI applications.
Responsibilities
Design NLP/LLM/GenAI applications/products by following robust coding practices
Explore SoTA models and techniques that can be applied to automotive industry use cases
Conduct ML experiments to train models and run inference; where needed, build models that meet memory and latency constraints
Deploy REST APIs or a minimal UI for NLP applications using Docker and Kubernetes
Showcase NLP/LLM/GenAI applications in the best way possible to users through web frameworks (Dash, Plotly, Streamlit, etc.)
Consolidate multiple bots into super apps using multimodal LLMs
Build modular AI/ML products that can be consumed at scale
Requirements
Bachelor’s or Master’s degree in Computer Science, Engineering, Mathematics, or Science
Strong communication skills and excellent teamwork over Git, Slack, email, and calls with team members across geographies
Experience with LLMs such as PaLM and GPT-4, and with open-source models such as Mistral
Experience working through the complete lifecycle of Gen AI model development, from training and testing to deployment and performance monitoring
Experience developing and maintaining AI pipelines across multiple modalities such as text, image, and audio
Experience developing image generation/translation tools using latent diffusion models such as Stable Diffusion or InstructPix2Pix
Strong familiarity with DL theory and practice in NLP applications
Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and pandas
Knowledge of fundamental text data processing (e.g. regex, token/word analysis, spelling correction, and noise reduction in text)
Familiarity with Docker tools and pipenv/conda/poetry environments
Good working knowledge of other open-source packages for benchmarking and deriving summaries
Experience using GPUs/CPUs on cloud and on-prem infrastructure
Familiarity with orchestration tools such as Airflow and Kubeflow
Good UI skills to visualize and build better applications using Gradio, Dash, Streamlit, React, Django, etc.
Skills in distributed computing with Spark, Dask, or RAPIDS (e.g. cuDF)
Experience with Elasticsearch, Apache Solr, and vector databases is a plus