Hybrid Senior Researcher – Text to Speech

Posted 5 hours ago

About the role

  • Senior Researcher developing advanced Text-to-Speech models on a cutting-edge data science team, leading research on naturalness, expressiveness, and low-latency speech synthesis.

Responsibilities

  • Lead research on Text-to-Speech models focused on naturalness, expressiveness, latency, and robustness
  • Design and train TTS systems for real-world voices across accents, languages, and speaking styles
  • Improve streaming and low-latency speech synthesis pipelines
  • Experiment with architectures, loss functions, and data strategies (multi-speaker training, style modeling, distillation, data augmentation)
  • Translate research ideas into production-ready TTS systems
  • Collaborate closely with infra, product, and voice engineering teams

Requirements

  • 3–6 years of specialized speech experience in academia or industry
  • Strong background in Text-to-Speech / speech generation research
  • Hands-on experience with deep learning frameworks (PyTorch preferred)
  • Experience with real-time or low-latency TTS systems
  • Familiarity with modern TTS architectures (Tacotron-style, FastSpeech, VITS, diffusion-based, neural vocoders)
  • Ability to think end-to-end: data → model → inference → deployment
  • Prior work in multilingual, expressive, or accented speech synthesis is a strong plus
  • Publications in top speech / ML conferences
  • Experience deploying TTS models in real-time production
  • Exposure to conversational AI or voice agents

Benefits

  • We pay top dollar for the best candidates.

Job title

Senior Researcher – Text to Speech

Job type

Experience level

Senior

Salary

$200,000 - $300,000 per year

Degree requirement

Postgraduate Degree

Tech skills

Location requirements
