speech-style-transfer AI Models

x-lance/f5-tts

Synthesize speech from text in a cloned voice using a reference audio sample. Provide a text prompt and speaker referenc...

📝 → 🔊 • text-to-speech • voice-cloning • 32.1K runs

ttsds/fishspeech_1_2_sft

Clone a target voice and generate speech audio from text. Provide a short speaker reference audio and its transcript (te...

📝 → 🔊 • text-to-speech • voice-cloning • 253 runs

chenxwh/openvoice

Clone a voice from a short reference clip and generate speech from text. Accepts text and a reference audio sample; outp...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 77.8K runs

ttsds/f5

Generate speech from text in a cloned voice using a reference audio sample and its transcript. Accepts text plus speaker...

📝 → 🔊 • text-to-speech • voice-cloning • 2.7K runs

ttsds/hierspeechpp_1_1

Generate speech from text conditioned on a reference voice sample. Input text and a speaker reference audio clip, and ou...

📝 → 🔊 • text-to-speech • voice-cloning • 258 runs

chenxwh/cosyvoice2-0.5b

Generate multilingual speech from text with zero-shot voice cloning. Provide a short reference audio clip and its transc...

📝 → 🔊 • text-to-speech • voice-cloning • 6.3K runs

lucataco/indextts-2

Generate expressive speech from text with zero-shot voice cloning using a reference speaker audio input. Control emotion...

📝 → 🔊 • text-to-speech • voice-cloning • emotion-control • 1.4K runs

ttsds/parlertts_tiny_1_0

Generate speech audio from text, with optional voice cloning conditioned on a reference recording. Accepts text, an opti...

📝 → 🔊 • text-to-speech • voice-cloning • speech-style-transfer • 200 runs

ttsds/fishspeech_1_4

Generate speech audio from text while cloning a target voice from a reference audio sample. Provide the text to speak, a...

📝 → 🔊 • text-to-speech • voice-cloning • 219 runs

zsxkib/hololive-style-bert-vits2

Generate Hololive VTuber-style speech from text or convert a reference audio clip into those voices. Takes text input or...

📝 → 🔊 • text-to-speech • audio-to-audio • speech-style-transfer • 886 runs

jichengdu/cosyvoice

Clone a speaker's voice and synthesize speech from text, including cross-lingual and mixed-lingual output. Accepts refer...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 1.7K runs

ttsds/styletts2

Generate speech from text with optional voice cloning from a reference audio sample. Accepts text plus an optional speak...

📝 → 🔊 • text-to-speech • voice-cloning • speech-style-transfer • 334 runs

pseudoram/rvc-v2

Convert speech to a target voice using RVC v2 voice models. Takes an input speech audio clip and outputs converted audio...

🔊 • audio-to-audio • voice-cloning • speech-style-transfer • 1.1M runs

ttsds/parlertts_mini_1_1_fixed

Generate spoken audio from text, optionally cloning a target voice from a short speaker reference audio. Accepts text as...

📝 → 🔊 • text-to-speech • voice-cloning • 193 runs

adirik/styletts2

Convert text to expressive speech, with optional speaker style cloning from a short reference audio. Accepts text input...

📝 → 🔊 • text-to-speech • voice-cloning • 131.8K runs

lucataco/neutts-air

Clone a voice from a short reference sample and synthesize new speech from text. Accepts text to speak, a 3–15s mono ref...

📝 → 🔊 • text-to-speech • voice-cloning • 168 runs

ttsds/parlertts_mini_0_1

Generate speech audio from text, with optional voice cloning from a reference speaker clip. Accepts text as the primary...

📝 → 🔊 • text-to-speech • voice-cloning • speech-style-transfer • 198 runs

ttsds/amphion_vevo

Generate speech from text with zero-shot voice cloning using a reference voice sample. Accepts text, a speaker reference...

📝 → 🔊 • text-to-speech • voice-cloning • speech-style-transfer • 499 runs

rossjillian/soft-vc

Convert speech from one voice to another using soft speech units while preserving the original content. Takes an audio...

🔊 • speech-style-transfer • voice-cloning • audio-to-audio • 27 runs