Data Scientist delivering data science projects for Samba TV. Working on knowledge graphs, audience modeling, and mentoring junior team members.
Responsibilities
Own end-to-end delivery of significant data science projects — from problem scoping and approach design through to production deployment, with a focus on knowledge graph and identity solutions
Make sound, independently reasoned decisions on methodology, model selection, and evaluation; document them clearly in technical solution documents covering problem statement, approach, metrics, and timeline
Lead solution design for your own initiatives; break down complex epics into well-scoped user stories with clear acceptance criteria, adopting DataOps and MLOps best practices throughout — experiment tracking, pipeline orchestration, model monitoring, and reproducibility
Build production-quality Python and PySpark code on Databricks — well-tested, documented, and reusable — and implement advanced ML and AI-powered workflows including entity resolution, probabilistic record linkage, embedding-based matching, semantic similarity, and LLM-augmented pipelines
Develop and maintain reusable tools, libraries, and documentation that improve team efficiency and technical standards; conduct code reviews with constructive, specific feedback that raises the bar
Mentor junior data scientists on technical execution, code quality, and career development; lead internal talks or workshops on knowledge graphs, identity, or ML topics
Collaborate cross-functionally with product, engineering, and operations — translate business requirements into technical specifications, partner with data engineering on scalable pipeline design, and participate in cross-functional design reviews and working groups
Requirements
Bachelor's degree in Statistics, Data Science, Computer Science, Mathematics, or a related quantitative field required; Master's strongly preferred
3–5 years of hands-on data science experience with demonstrated ability to own and deliver complex, multi-sprint projects independently
Advanced Python with production-quality code, testing, and documentation; strong SQL and PySpark for billion-row datasets
Experience with Databricks workflows, Delta Lake, and job orchestration; working knowledge of cloud platforms (AWS or GCP)
Solid command of core ML — regression, classification, clustering, model evaluation, and experimental design — applied to complex, high-volume data
Proficiency with MLOps practices: experiment tracking, pipeline orchestration (Airflow), and reproducible model deployment
Exposure to modern AI methodologies: RAG systems, LLM-augmented models, vector databases, and semantic search
Strong communicator — able to translate technical work into clear documentation, user stories, and cross-functional conversations
Demonstrated ability to mentor junior data scientists and contribute to team standards
Data Scientist / Machine Learning Engineer for a digital product company in Cologne. Focus on building solutions for recommendation systems and machine learning services.
Data Scientist at PulseRise Technologies driving data-informed decisions in crypto payments infrastructure. Collaborating with product leaders and the CFO for insights on user journeys and transaction flows.
Senior Data Scientist at Navy Federal Credit Union providing data science and advanced analytics insights for decision-making. Collaborating with business units to optimize products and strategies.
Senior Data Scientist driving analytics and machine learning for b_labs at B.TECH. Collaborating with teams to innovate and support the company's goal of becoming a leading omni-channel platform in Egypt.
Data Scientist at SPG Resourcing analyzing data and shaping strategic direction. Collaborating across teams and developing data strategies to enhance products and services.
Junior Data Scientist supporting analytical projects to facilitate decision-making for companies in Brazil. Involves data preparation, cleaning, and preliminary analysis for effective insights generation.
Data Scientist in Klee Group's Data & AI team developing innovative AI solutions. Collaborating with clients and teams in a hybrid work environment with autonomy and support.
Senior Data Scientist at Keyrus focused on advanced analytics and machine learning. Responsibilities include data structuring, analysis, and predictive modeling in a hybrid work setting.
Data Manager role at DEDIENNE AEROSPACE focusing on Data Lake management and data governance. Involves collaboration with internal and external stakeholders for data quality and analytics.
Applied Data Scientist focused on operational analytics at Rowan Digital Infrastructure. Developing models and metrics to enhance reliability and decision making for data center solutions.