text-to-speech AI Models - Page 7

elevenlabs/v2-multilingual

Generate multilingual text-to-speech audio from text input. Convert up to ~10,000 characters per request into natural, e...

📝 → 🔊 • text-to-speech • multilingual • 1.2K runs

elevenlabs/v3

Generate expressive speech audio from text input. Choose from preset voices (e.g., Rachel, Drew, Paul, Aria, Domi, Dave,...

📝 → 🔊 • text-to-speech • multilingual • 44 runs

minimax/speech-2.6-hd

Generate multilingual speech audio from text input. Control voice selection (300+ system voices or a custom cloned voice...

📝 → 🔊 • text-to-speech • multilingual-tts • voiceover • 2.4K runs

minimax/speech-2.6-turbo

Generate speech audio from text with low latency for real-time agents and interactive apps. Accept text input and output...

📝 → 🔊 • text-to-speech • multilingual-tts • low-latency • 2.0K runs

lucataco/interactiveomni-8b

Process multiple inputs simultaneously, including images, audio, text, and video, to generate coherent text and speech r...

📝 → 🔊 • text-generation • image-to-text • video-to-text • 86 runs

elevenlabs/turbo-v2.5

Convert text to speech audio with low latency for real-time voice agents, chatbots, and interactive apps. Generate multi...

📝 → 🔊 • text-to-speech • multilingual • real-time • 26 runs

elevenlabs/flash-v2.5

Convert text to speech with ultra-low latency for real-time voice agents, chatbots, and interactive apps. Accepts text p...

📝 → 🔊 • text-to-speech • multilingual • real-time • 330 runs

avocado/podcast-clip-generator

Generate short podcast clips with a talking AI host from a text prompt. Provide a podcaster_prompt, choose a voice (Wise...

📝 → 🎥 • text-to-video-with-audio • text-to-speech • podcast • 23 runs

resemble-ai/chatterbox-turbo

Convert text to speech with low latency for voice agents, narration, and interactive applications. Accepts text (up to 5...

📝 → 🔊 • text-to-speech • voice-cloning • 15 runs

acappemin/deepaudio-v1

Generate speech and soundtracks from a video input. Condition speech on a provided transcript (text) and optionally a re...

📝 → 🔊 • video-to-audio • text-to-speech • 63 runs

douwantech/gpt-sovits-train

Trains and fine-tunes GPT-SoVITS voice models from custom audio or video datasets for high-quality voice cloning and tex...

📝 → 🔊 • voice-cloning • text-to-speech • 219 runs

qwen/qwen3-tts

Generate multilingual speech from text with preset voices, voice cloning, and voice design. Accept text plus optional la...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 216.6K runs

qwen/qwen-tts

Generate speech from text with preset, cloned, or designed voices. Accept text as input and return spoken audio. Choose...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 9 runs

thanhnew2001test/nghitts3

Synthesize Vietnamese speech from text input. Accepts Vietnamese text and returns spoken audio, with optional preprocess...

📝 → 🔊 • text-to-speech • vietnamese • 13 runs

voiser-ai/moss-tts

Convert text to speech with optional voice cloning from a reference audio sample. Accepts text and an optional speaker r...

📝 → 🔊 • text-to-speech • voice-cloning • 15 runs

inworld/tts-1.5-max

Convert text to natural, expressive speech with latency under 200 ms. Supports 15 languages, including English, Chine...

📝 → 🔊 • text-to-speech • voice-cloning • 79.5K runs

inworld/tts-1.5-mini

Convert text to speech with ultra-fast ~120 ms latency and support for 15 languages, including English, Chinese, Japanese...

📝 → 🔊 • text-to-speech • voice-cloning • 30.2K runs

minimax/speech-2.8-hd

Convert text to natural-sounding speech. Generate high-fidelity audio from up to 10,000 characters with 17+ preset voice...

📝 → 🔊 • text-to-speech • voice-cloning • 2 runs

minimax/speech-2.8-turbo

Generate speech audio from text input. Control emotion (neutral, happy, sad, angry, fearful, disgusted, surprised, calm,...

📝 → 🔊 • text-to-speech • multilingual • 1 run

ttsds/maskgct

Convert text to speech with zero-shot voice cloning from a reference audio sample. Provide target text and language (Eng...

📝 → 🔊 • text-to-speech • voice-cloning • 483 runs