Text-to-Speech AI Models

lucataco/higgs-audio-v2

Generate expressive, multilingual speech audio from text input. Produce zero-shot multi-speaker dialogues, emotional del...

📝 → 🔊 • text-to-speech • multilingual • 1.4K runs

haoheliu/audio-ldm

Generate audio from a text prompt. Produce sound effects, human speech, and music, with controls for clip duration and t...

📝 → 🔊 • text-to-audio • sound-effect-generation • music-generation • 38.2K runs

zsxkib/dia

Generate multi-speaker dialogue audio from text. Specify speakers with [S1], [S2], etc., and include non-verbal cues in...

📝 → 🔊 • text-to-speech • voice-cloning • multi-speaker-tts • 9.3K runs
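
As a rough illustration of the [S1]/[S2] speaker-tag convention mentioned in the zsxkib/dia description, the sketch below builds a tagged script from a list of dialogue turns. The helper name `format_dialogue` and the single-line, space-separated layout are assumptions made for illustration; the listing does not document the exact script format the model expects.

```python
def format_dialogue(turns):
    """Join (speaker_number, text) pairs into one script with [S<n>] tags.

    Hypothetical helper: the layout (one line, turns separated by spaces)
    is an assumption, not a documented requirement of the model.
    """
    parts = []
    for speaker, text in turns:
        if speaker < 1:
            raise ValueError("speaker numbers start at 1")
        # Prefix each turn with its speaker tag, e.g. "[S1] Hello"
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


script = format_dialogue([
    (1, "Did you hear the news?"),
    (2, "Not yet, tell me."),
])
print(script)
```

The resulting string could then be passed as the text input to a multi-speaker model, subject to whatever input fields that model actually defines.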

camenduru/metavoice

Convert text to speech using MetaVoice-1B, a 1.2-billion-parameter audio model trained on 100,000 hours of speech. Input...

📝 → 🔊 • text-to-speech • audio-synthesis • speech-generation • 12.5K runs

lucataco/neutts-air

Clone a voice from a short reference sample and synthesize new speech from text. Accepts text to speak, a 3–15s mono ref...

📝 → 🔊 • text-to-speech • voice-cloning • 168 runs

vladpolbennikov/kokoro-82m-all-voices

Generate speech audio from text. Select from preset voices and control speaking speed; automatically split long inputs f...

📝 → 🔊 • text-to-speech • 940 runs

pipeline-examples/stapledon-podcast

Generate imaginative text responses from a prompt and convert them to audio narration with customizable voice options. S...

📝 → 🔊 • text-to-speech • audio-narration • voice-customization • 2 runs

lee101/guided-text-to-speech

Generate speech from text using natural-language voice descriptions. Provide a text prompt and a free-form voice descrip...

📝 → 🔊 • text-to-speech • 431 runs

ttsds/pheme

Generate speech from text in the voice of a reference speaker. Takes a text prompt, a speaker reference audio clip, and...

📝 → 🔊 • text-to-speech • voice-cloning • 695 runs

ttsds/xtts_1

Generate speech from text using a cloned voice from a reference audio sample. Accept text and a speaker reference, then...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 714 runs

ttsds/e2

Generate speech audio from text, cloning the voice from a provided reference recording. Provide the text to speak, a spe...

📝 → 🔊 • text-to-speech • voice-cloning • 270 runs

ttsds/f5

Generate speech from text in a cloned voice using a reference audio sample and its transcript. Accepts text plus speaker...

📝 → 🔊 • text-to-speech • voice-cloning • 2.7K runs

lucataco/indextts-2

Generate expressive speech from text with zero-shot voice cloning using a reference speaker audio input. Control emotion...

📝 → 🔊 • text-to-speech • voice-cloning • emotion-control • 1.4K runs

cjwbw/parler-tts

Generate speech audio from text with natural-language control over voice and acoustics. Provide a script and an optional...

📝 → 🔊 • text-to-speech • 2.7K runs

lucataco/neutts

Generate speech audio from text with instant voice cloning from a short reference clip. Provide a text prompt and 3–15 s...

📝 → 🔊 • text-to-speech • voice-cloning • 6 runs

alicewuv/kitten-tts

Convert text to speech on CPU with multiple built-in voices. Accepts text and outputs spoken audio, with controls for vo...

📝 → 🔊 • text-to-speech • cpu-only • 24 runs

kjjk10/kokoro-82m

Generate speech audio from text with selectable preset voices. Provide a text prompt and choose a voice (af, af_bella, a...

📝 → 🔊 • text-to-speech • 27.4K runs

ictnlp/llama-omni

Answer spoken queries with simultaneous text and speech output. Accepts a speech audio input and an optional instruction...

📝 → 🔊 • text-to-speech • text-generation • voice-assistant • 60.1K runs

lucataco/qwen2.5-omni-7b

Process text, images, audio, and video inputs to generate text and speech responses simultaneously. Features a novel Thi...

📝 → 🔊 • text-generation • image-to-text • video-to-text • 32.9K runs

minimax/speech-02-turbo

Generate speech audio from text with low latency for real-time applications. Choose from 300+ prebuilt voices or supply...

📝 → 🔊 • text-to-speech • multilingual-tts • real-time • 5.1M runs