Devise data strategies, such as data sampling, that make better use of our labeled datasets to improve the performance of our auto-labeling pipeline
Develop and maintain data pipelines to monitor annotation quality metrics, model failure patterns, and dataset characteristics at scale using data warehouses and distributed processing tools
Collaborate with the ML team to diagnose model bottlenecks through statistical analysis and data slicing, identifying whether issues come from insufficient training samples, distribution shifts, or systematic biases in the annotation process
Participate in data-related activities within the team and across the company in collaboration with Toyota group, and help strengthen team members' data science capabilities
Requirements
2+ years of experience in the following industries or research areas:
・AD/ADAS
・Robotics
・Computer Vision
2+ years of experience in data science or related areas, including theoretical aspects of data science such as machine learning (including deep learning), statistical analysis, and mathematical modeling
Experience writing software using 1) Python for data science (NumPy, SciPy, scikit-learn, pandas), 2) databases, and 3) cloud platform services (AWS, GCP, Azure)
Bachelor's degree in science or engineering
Business-level proficiency in English
Nice to Haves
Master's degree or Ph.D. in a related field
5+ years of experience in data science or related areas
Hands-on experience in the following:
・Experience with computer vision datasets and annotation pipelines, particularly for autonomous systems or multi-camera setups, with a track record of identifying and resolving data quality issues
・Familiarity with active learning strategies and uncertainty quantification techniques to prioritize which samples need human review or re-annotation
・Proficiency with data visualization tools and statistical methods for large-scale dataset analysis, enabling quick identification of distribution shifts, labeling inconsistencies, or underrepresented scenarios in the training data
・Strong Python programming skills with experience using Git for version control and collaborative development in a team environment
・Hands-on experience with data infrastructure tools such as data warehouses, dbt for data transformation, and Spark for large-scale data processing
Business-level proficiency in Japanese (especially smooth reading and listening)
Benefits
Competitive Salary - Based on experience
Work Hours - Flexible working time
Paid Holiday - 20 days per year (prorated)
Sick Leave - 6 days per year (prorated)
Holiday - Sat & Sun, Japanese National Holidays, and other days defined by our company
Japanese Social Insurance - Health Insurance, Pension, Workers’ Compensation, Unemployment Insurance, and Long-Term Care Insurance
Housing Allowance
Retirement Benefits
Rental Car Support
In-house Training Program (software study/language study)