cuuupid/zonos
Generate speech from text with optional voice cloning from a short reference audio. Accept text plus a 5–30s speaker sam...
Found 100 models (showing 81-100)
Generate speech from text with optional voice cloning from a short reference audio. Accept text plus a 5–30s speaker sam...
Convert text to speech with optional zero-shot voice cloning. Accept a text prompt and an optional speaker reference aud...
Generate speech, music, background noise, and simple sound effects from a text prompt. Output an audio file, with an opt...
Generate speech in a cloned voice from text input. Provide a reference audio clip and its transcript to capture the targ...
Clone a speaker's voice from a 6-second sample and synthesize speech from text in Vietnamese and 17 other languages. Acc...
Convert speech or singing in an input audio clip into a target voice using RVC (Retrieval-based Voice Conversion), and o...
Clone a voice from a short reference sample and synthesize speech from text. Provide a voice sample and target text to g...
Clone a target voice and synthesize speech from text or convert reference speech to the target voice (zero-shot). Provid...
Generate Vietnamese speech from text with zero-shot voice cloning from a reference audio sample. Accepts input text, a r...
Generate up to 60 seconds of music with vocals from lyrics and a reference track. Condition on a reference song to learn...
Dub videos into 100+ languages with cloned voices. Takes a video input and returns a dubbed video with translated speech...
Train a custom RVC (Retrieval-based Voice Conversion) voice-conversion model from an audio dataset. Input a dataset zip of segmen...
Convert text to speech with low latency for voice agents, narration, and interactive applications. Accepts text (up to 5...
Generate multilingual speech from text with preset voices, voice cloning, and voice design. Accept text plus optional la...
Generate speech from text with preset, cloned, or designed voices. Accept text as input and return spoken audio. Choose...
Convert text to speech with optional voice cloning from a reference audio sample. Accepts text and an optional speaker r...
Convert text to natural, expressive speech with sub-200ms latency. Accepts plain text (up to 2,000 characters) with SSML...
Generate speech audio from text for real-time voice agents and conversational apps. Accepts text input and outputs spoke...
Convert text to natural-sounding speech. Generate high-fidelity audio from up to 10,000 characters with 17+ preset voice...
Convert text to speech with zero-shot voice cloning from a reference audio sample. Provide target text and language (Eng...