Data Scientist creating scalable insights from unstructured data at an AI safety company. Collaborating with engineering and research teams in a hybrid role based in Paris.
Responsibilities
Turn petabytes of unstructured text into a structured, explorable view (topics, clusters, segments, trends, anomalies): iterate from “unknown unknowns” to stable definitions we can track.
Build scalable representation pipelines: sampling strategies, preprocessing/normalization, embeddings at scale, indexing, and retrieval to make the corpus searchable and analyzable.
Use LLMs pragmatically: labeling/classification, weak supervision, data enrichment, summarization, and automated diagnostics of inbound volumes (with cost/quality controls).
Deliver insights that change decisions: translate findings into product and operational actions (what data we have, what’s missing, where quality breaks, what to prioritize next).
Ship self-serve analytics: datasets, data models, and lightweight tools/dashboards so the team can explore and answer questions without ad-hoc requests.
Partner closely with engineering/research: align pipelines with production constraints (latency/cost/privacy), and integrate outputs into workflows.
Requirements
Strong Python + SQL with an engineering mindset: you can build reliable pipelines, not just notebooks.
Solid applied NLP/ML experience on real-world text: embeddings, clustering, topic modeling, semantic search, classification; you understand failure modes and how to debug them.
Comfortable at scale: distributed processing, large-scale storage and querying, and performance/cost tradeoffs.
You know how to evaluate fuzzy problems: offline/online metrics, human-in-the-loop labeling, inter-annotator agreement, drift monitoring, and reproducibility.
Prior work with safety/moderation datasets, policy/rule systems, or high-volume logging/observability.
Benefits
20 days of paid vacation
Work from Paris (hybrid) + relocation package
Best medical insurance in France
All the hardware, tools, and services you need
Covered subscriptions for AI agents and IDEs
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez