
dessix/moss-ttsd
Generate two-speaker conversational speech from a text dialogue script. Accept a script with [S1]/[S2] turns and optiona...
Found 86 models (showing 61-80)
Generate two-speaker conversational speech from a text dialogue script. Accept a script with [S1]/[S2] turns and optiona...
Convert text to speech and optionally clone a target voice from a short speaker reference audio. Accepts text plus an op...
Generate speech audio from text with voice cloning. Provide the text to speak, a speaker reference audio clip, and the t...
Clone a voice and synthesize English speech from text using a short reference audio sample. Accepts a text prompt, a bri...
Generate speech from text in a cloned voice using zero-shot or few-shot reference audio. Provide target text, a 10–30s s...
Translate speech to another language while preserving the speaker's voice, style, pronunciation, and tone. Accepts spoke...
Clone a voice and generate speech from text using a reference audio clip and its transcript. Provide target text to spea...
Generate speech from text using a reference speaker clip for zero-shot voice cloning. Provide a text prompt and an audio...
Create an RVC v2 voice cloning dataset from a YouTube video. Provide a YouTube URL and optionally a dataset name; it dow...
Convert singing voice recordings to the timbre of a selected professional singer. Takes a source vocal audio input and o...
Generate cloned-voice speech from text using a reference audio sample. Accepts text (gen_text) and reference audio (ref_...
Generate speech from text using a reference voice. Provide the text to speak, a short speaker reference audio clip, and...
Generate speech from text with optional voice cloning from a reference audio sample. Accepts a text prompt and an option...
Generate speech from text in the voice of a reference speaker. Provide text and a speaker reference audio sample; receiv...
Synthesize speech from text using a short reference voice clip (zero-shot TTS). Clone a target speaker's voice in Englis...
Generate speech audio from text, with optional voice cloning conditioned on a reference recording. Accepts text, an opti...
Clone a voice and synthesize speech from text in 17 languages. Provide a short reference speaker clip (at least 6 second...
Convert text to speech conditioned on a reference voice clip. Provide text, a language code (English, German, Spanish, I...
Generate multilingual speech audio from text. Optionally clone a target voice from a speaker reference audio and control...
Generate speech from text with optional voice cloning and emotion control. Accept a text prompt plus an optional 10–30s...