Data Scientist at Stefanini shaping LLM customization through data pipelines and diverse data sources. The role covers data structuring, quality assurance, and efficient storage practices.
Responsibilities
Design and implement data pipelines to support the LLM customization process
Collect, process, and structure diverse data sources
Develop scripts and processes for extracting structured and unstructured data
Implement transformations to convert raw data into formats suitable for training
Ensure the quality, consistency, and relevance of the data used for training
Create mechanisms for validation and testing of datasets
Develop processes for data enrichment
Implement efficient storage for data and training results
Configure data integration between the trained model and the Elastic platform
Document data architecture, flows, and transformations
Implement data versioning and traceability practices
Optimize data flow for model training iterations
Ensure security and compliance in the handling of training data
Requirements
Additional courses in natural language processing or data preparation for ML (desirable)
Practical knowledge of the Elastic Stack platform (Elasticsearch, Logstash, Kibana) | Level: Advanced (Required)
Experience preparing datasets for training language models | Level: Advanced (Required)
Experience with extraction, transformation, and loading (ETL) of unstructured data | Level: Advanced (Required)
Benefits
Meal allowance or food voucher
Discounts on courses, universities, and language schools
Stefanini Academy — a platform with free, up-to-date online courses and certificates
Mentoring
Benefits club for consultations and medical exams
Health insurance
Dental insurance
Employee discounts and benefits at top establishments