lucataco/neutts-air 📝🖼️ → 🖼️

▶️ 478 runs 📅 Oct 2025 ⚙️ Cog 0.16.8-dev+g171b1d52 🔗 GitHub ⚖️ License

speech-style-transfer text-to-speech voice-cloning

Performance

19.3s Typical run time

~368s Cold start (first call)

478 Total runs

About

Super-realistic text-to-speech (TTS) speech language model with instant voice cloning

Example Output

Output

Performance Metrics

19.27s Prediction Time

367.50s Total Time
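
The gap between the two figures above is worth making explicit. The short calculation below is a sketch that assumes Total Time covers everything from request to result (queueing and cold boot included), while Prediction Time covers only the model run; the variable names are illustrative.

```python
# Figures copied from the Performance Metrics above.
prediction_s = 19.27
total_s = 367.50

# Time not spent on the prediction itself (assumed: queueing plus cold boot).
overhead_s = round(total_s - prediction_s, 2)
prediction_share = prediction_s / total_s

print(f"overhead: {overhead_s:.2f}s")
print(f"prediction share of total: {prediction_share:.1%}")
```

Under that assumption, roughly 348 seconds of this example run were spent before the prediction started, which matches the ~368s cold-start figure listed under Performance.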

All Input Parameters

{
  "text": "My name is Dave, and um, I'm from London.",
  "ref_text": "So I'm live on radio. And I say, well, my dear friend James here clearly, and the whole room just froze. Turns out I'd completely misspoken and mentioned our other friend.",
  "ref_audio": "https://replicate.delivery/pbxt/Nqm1eHrhRE8RIR9uAIlwDjlQY2yBswQzlkh1myYC67Ixycag/dave.wav"
}
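
The same parameters can be sent from code. The following is a minimal sketch, assuming the official `replicate` Python package is installed and a `REPLICATE_API_TOKEN` environment variable is set. `build_input` and `synthesize` are hypothetical helper names introduced here for illustration; the version hash is the one listed under Version Details.

```python
MODEL = "lucataco/neutts-air"
VERSION = "607e7b40e5fbaa97b828a6c71848c6b6ffdcf26c04feb6da0b1a411dfe9a7978"


def build_input(text, ref_audio, ref_text=""):
    """Assemble the input payload using the parameter names listed on this page."""
    if not ref_audio:
        raise ValueError("ref_audio is required")
    return {"text": text, "ref_text": ref_text, "ref_audio": ref_audio}


def synthesize(payload):
    """Run the model remotely; needs network access and a Replicate API token."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)
```

Calling `synthesize(build_input(text=..., ref_text=..., ref_audio=...))` with the values shown above should return the generated audio, either as a URI string or as a file-like object depending on the client version. Expect the first call to be slow if the model is cold.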

Input Parameters

text (Type: string; Default: "My name is Dave, and um, I'm from London."): The text to synthesize as speech
ref_text (Type: string; Default: empty): Transcript of the reference audio (what is being said in the audio file)
ref_audio (required; Type: string): Reference audio file (.wav) for voice cloning (3-15 seconds, mono, 16-44 kHz)
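
The `ref_audio` constraints can be checked locally before uploading. This is a sketch using only the standard-library `wave` module, which reads uncompressed PCM WAV files. `check_reference_wav` is a hypothetical helper, and reading "44kHz" as an upper bound of 44,100 Hz is an assumption, not something the page states.

```python
import wave


def check_reference_wav(path):
    """Return a list of problems; an empty list means the file fits the stated limits."""
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    # Assumption: "16-44kHz" means 16,000 Hz to 44,100 Hz inclusive.
    if not 16_000 <= rate <= 44_100:
        problems.append(f"sample rate {rate} Hz is outside 16-44 kHz")
    if not 3.0 <= duration <= 15.0:
        problems.append(f"duration {duration:.1f}s is outside 3-15 seconds")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every issue with a clip at once.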

Output Schema

Output

Type: string • Format: uri
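
Because the output is a URI rather than raw audio, the file has to be fetched separately. The sketch below uses only the standard library; `save_output` is a hypothetical helper, and the fallback filename `output.wav` is an assumption for URIs that carry no filename.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def save_output(uri, dest_dir="."):
    """Download the output URI and return the local path it was saved to."""
    # Reuse the last path segment of the URI as the local filename.
    name = Path(urlparse(uri).path).name or "output.wav"
    dest = Path(dest_dir) / name
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Streaming with `shutil.copyfileobj` avoids holding the whole audio file in memory, though for clips of this length either approach would work.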

Example Execution Logs

Encoding reference audio: /tmp/tmpby22gq5vdave.wav
Generating speech for: My name is Dave, and um, I'm from London....
Speech generated successfully: /tmp/tmpfh953fjx.wav

Version Details

Version ID: 607e7b40e5fbaa97b828a6c71848c6b6ffdcf26c04feb6da0b1a411dfe9a7978
Version Created: October 8, 2025
