Data Scientist working on enhancing AI agent performance metrics and experiments. Analyzing data and collaborating with cross-functional teams to drive product improvements.
Responsibilities
Design and analyze experiments to measure agent improvements—from model changes to UX variations—with statistical rigor and practical tradeoffs.
Define success metrics that connect agent trace data (prompts, responses, code changes, execution outcomes) to user outcomes like successful deploys, retention, and revenue.
Build the semantic layer for agent data in partnership with data engineering—defining the tables, metrics, and models that enable self-serve analysis across the AI team.
Surface insights from trace analysis that identify failure modes, successful patterns, and opportunities to improve agent effectiveness.
Partner with AI engineering, product, and leadership to translate data into roadmap decisions; you'll have a seat at the table for critical agent strategy discussions.
Create dashboards and reporting that surface agent performance metrics (task completion, latency, quality scores, user satisfaction) for the AI team and executives.
Requirements
5+ years of experience in data science, analytics, or a quantitative role with a focus on product, growth, or experimentation.
Deep experimentation expertise: A/B testing, experiment design, power analysis, handling skewed data, interpreting results beyond p-values.
Strong SQL skills and experience designing data models for high-volume event data; experience with dbt or similar transformation tools.
Proficiency in Python and data science libraries (pandas, scipy, statsmodels, etc.).
Ability to translate ambiguous questions into structured analysis and communicate findings clearly to both technical and non-technical stakeholders.
Bias toward action: you ship insights that influence decisions, not just dashboards.
Data Science Intern joining Seagate's Product Development Group to leverage AI and ML technologies. Gaining hands-on experience in data analysis and model development on real-world projects.
Staff Data Scientist at Pagaleve developing advanced ML models for credit and fraud solutions. Providing technical leadership in a fintech setting, bridging data science and engineering teams.
Lead Data Scientist responsible for scalable AI/ML models and application backends at Cloudflare. Collaborate with teams to deliver features and operate data platforms in a hybrid environment.
Senior Enterprise Solutions Technical Lead at City of Toronto overseeing BI & Data services. Leading a team in data management and analytics aligned with organizational goals.
Data Scientist at Kpler enhancing Gas and Power teams to aggregate data for future forecasts. Collaborating with engineers and product teams for model deployment and performance enhancement.
Data Scientist developing ML models and analyzing various data sources at Taikonauten GmbH. Contributing to user-centered product and project development in the R&D team.
Lead AI and Data Scientist shaping impactful AI solutions in Madrid's EMEA Digital Innovation Hub. Collaborating globally to apply advanced machine learning techniques and foster innovation.
Senior Associate at PwC focusing on data analytics to drive insights and guide client strategies. Involves advanced techniques and collaboration on AI and GenAI solutions.