lucataco/neutts-air 📝🖼️ → 🖼️

▶️ 147 runs 📅 Oct 2025 ⚙️ Cog 0.16.8-dev+g171b1d52
text-to-speech voice-cloning

About

Super-realistic text-to-speech (TTS) speech language model with instant voice cloning.

Example Output

Performance Metrics

19.27s Prediction Time
367.50s Total Time
All Input Parameters
{
  "text": "My name is Dave, and um, I'm from London.",
  "ref_text": "So I'm live on radio. And I say, well, my dear friend James here clearly, and the whole room just froze. Turns out I'd completely misspoken and mentioned our other friend.",
  "ref_audio": "https://replicate.delivery/pbxt/Nqm1eHrhRE8RIR9uAIlwDjlQY2yBswQzlkh1myYC67Ixycag/dave.wav"
}
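The input above could be sent to the model from Python roughly as follows. This is a minimal sketch, assuming the official `replicate` client package is installed and `REPLICATE_API_TOKEN` is set; `build_input` and `synthesize` are hypothetical helper names, and the version hash is the one listed under Version Details.

```python
# Model reference: owner/name plus the version ID from this page.
MODEL = (
    "lucataco/neutts-air:"
    "607e7b40e5fbaa97b828a6c71848c6b6ffdcf26c04feb6da0b1a411dfe9a7978"
)


def build_input(text, ref_audio, ref_text=""):
    """Assemble the input dictionary using the parameter names on this page."""
    if not ref_audio:
        # ref_audio is the only parameter marked as required.
        raise ValueError("ref_audio is required")
    return {"text": text, "ref_text": ref_text, "ref_audio": ref_audio}


def synthesize(text, ref_audio, ref_text=""):
    """Run the model and return whatever the client hands back."""
    # Imported lazily so build_input works without the client installed.
    # Assumption: `pip install replicate` and REPLICATE_API_TOKEN are in place.
    import replicate

    return replicate.run(MODEL, input=build_input(text, ref_audio, ref_text))
```

Calling `synthesize("My name is Dave, and um, I'm from London.", ref_audio=<URL of dave.wav>, ref_text=<its transcript>)` would reproduce the example request. Depending on the client version, the return value may be a URI string or a file-like wrapper around it.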
Input Parameters
text Type: string, Default: My name is Dave, and um, I'm from London.
The text to synthesize as speech
ref_text Type: string, Default: (empty)
Transcript of the reference audio (what is being said in the audio file)
ref_audio (required) Type: string
Reference audio file (.wav) for voice cloning (3-15 seconds, mono, 16-44 kHz)
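The limits stated for `ref_audio` can be checked locally before uploading. The sketch below uses only the standard-library `wave` module, which reads uncompressed PCM WAV files. `check_reference_wav` is a hypothetical helper, and reading the upper bound "44kHz" as 44,100 Hz is an assumption.

```python
import wave


def check_reference_wav(path):
    """Return a list of problems; an empty list means the stated limits are met."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate  # length in seconds

    problems = []
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    # Assumption: the "16-44kHz" range means 16,000 Hz to 44,100 Hz inclusive.
    if not 16_000 <= rate <= 44_100:
        problems.append(f"sample rate {rate} Hz is outside 16-44.1 kHz")
    if not 3.0 <= duration <= 15.0:
        problems.append(f"duration {duration:.2f} s is outside 3-15 s")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every issue with a clip at once.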
Output Schema

Output

Type: string, Format: uri
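Since the output is a URI rather than raw audio, the file has to be fetched separately. A minimal sketch using only the standard library, assuming the URI is directly downloadable without extra headers; `save_output` and the default file name `output.wav` are hypothetical.

```python
import shutil
import urllib.request


def save_output(uri, dest="output.wav"):
    """Stream the audio at `uri` into `dest` and return the destination path."""
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as out:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(response, out)
    return dest
```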

Example Execution Logs
Encoding reference audio: /tmp/tmpby22gq5vdave.wav
Generating speech for: My name is Dave, and um, I'm from London....
Speech generated successfully: /tmp/tmpfh953fjx.wav
Version Details
Version ID
607e7b40e5fbaa97b828a6c71848c6b6ffdcf26c04feb6da0b1a411dfe9a7978
Version Created
October 8, 2025