Hybrid Data Scientist, Evals

Posted last week

About the role

  • You will design evaluation metrics and pipelines that improve answer quality for Perplexity's AI products, working in a high-impact team that uses advanced machine learning methods.

Responsibilities

  • Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness
  • Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality
  • Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices
  • Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements
  • Operate within a small, high-impact team where your evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality

Requirements

  • PhD or MS in a technical field or equivalent experience
  • 4+ years of experience in data science or machine learning
  • Strong proficiency in Python and SQL (expected to write production-grade code)
  • Experience building within a modern cloud data stack, specifically AWS and Databricks
  • Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster
  • 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups (preferred)
  • Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale (preferred)
  • A strong research background, with experience applying research methods to real-world ML problems (preferred)
  • Experience defining evaluation metrics (e.g., factual consistency, hallucination rate, retrieval precision) and building ground truth datasets (preferred)

Benefits

  • U.S. Benefits: Full-time U.S. employees enjoy a comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter and dependent care accounts, and more.
  • International Benefits: Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.

Job title

Data Scientist, Evals

Experience level

Mid level, Senior

Salary

$210,000 - $385,000 per year

Degree requirement

Postgraduate Degree
