kjjk10/llasa-3b-long 📝🔢🖼️ → 🖼️

▶️ 1.5K runs 📅 Jan 2025 ⚙️ Cog 0.13.7
text-to-speech voice-cloning

About

State-of-the-art (SoTA) zero-shot voice cloning and text-to-speech (TTS) model.

Example Output

Performance Metrics

7.69s Prediction Time
7.70s Total Time
All Input Parameters
{
  "text": "I must not fear. Fear is the mind-killer. Fear is the little-death that brings total obliteration. I will face my fear. I will permit it to pass over me and through me. And when it has gone past I will turn the inner eye to see its path. Where the fear has gone there will be nothing. Only I will remain.",
  "prompt_text": "You open your eyes so that only a slender chink of light seeps in, and peer at the gingko trees in front of the Provincial Office. As though there, between those branches, the wind is about to take on visible form.",
  "chunk_length": 200,
  "voice_sample": "https://replicate.delivery/pbxt/MNaHFqDkZ0Y22hvppxotJazhRYe6TwhK78xAUTCoz3NB9bRV/voice_sample.wav"
}
Input Parameters
text (required) Type: string
Text to convert to speech
prompt_text Type: string
Optional prompt text. If not provided, it is extracted from the voice sample using Whisper.
chunk_length Type: integer, Default: 250
Length of text chunks for processing
voice_sample (required) Type: string
Voice sample audio file (16 kHz)
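The parameters above can be assembled into a request payload along these lines. This is a minimal sketch, not the model author's code: `build_input` and `run_prediction` are hypothetical helper names, the validation rules (non-empty text, positive `chunk_length`) are assumptions rather than documented behavior, and the call relies on the third-party `replicate` Python client with a `REPLICATE_API_TOKEN` set in the environment.

```python
from typing import Optional

# Model reference built from the owner/name and version ID shown on this page.
MODEL_REF = (
    "kjjk10/llasa-3b-long:"
    "0494f04972b675631af41c253a45c4341bf637f07eed9a39bad3b1fd66f73a2e"
)


def build_input(
    text: str,
    voice_sample: str,
    prompt_text: Optional[str] = None,
    chunk_length: int = 250,
) -> dict:
    """Assemble the input payload (hypothetical helper, not part of any API)."""
    # Assumed sanity checks; the page only marks these fields as required.
    if not text.strip():
        raise ValueError("text is required")
    if not voice_sample:
        raise ValueError("voice_sample is required")
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")

    payload = {
        "text": text,
        "voice_sample": voice_sample,
        "chunk_length": chunk_length,
    }
    # prompt_text is optional; when omitted, the page says Whisper extracts it.
    if prompt_text is not None:
        payload["prompt_text"] = prompt_text
    return payload


def run_prediction(payload: dict):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # third-party client, imported lazily

    return replicate.run(MODEL_REF, input=payload)
```

Calling `run_prediction(build_input("Hello there.", "https://example.com/voice.wav"))` would submit a prediction; the URL here is a placeholder, and the exact type of the returned value depends on the client version.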
Output Schema

Output

Type: string, Format: uri

Example Execution Logs
Processed prompts:   0%|          | 0/2 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
Processed prompts:  50%|█████     | 1/2 [00:04<00:04,  4.92s/it, est. speed input: 146.85 toks/s, output: 85.63 toks/s]
Processed prompts: 100%|██████████| 2/2 [00:07<00:00,  3.39s/it, est. speed input: 200.58 toks/s, output: 145.08 toks/s]
Processed prompts: 100%|██████████| 2/2 [00:07<00:00,  3.62s/it, est. speed input: 200.58 toks/s, output: 145.08 toks/s]
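The log above reports two prompts for the example input, which would be consistent with the text being split into two chunks at `chunk_length` 200. The page does not document the splitting rule, so the following is only an illustrative sketch under assumed behavior: `chunk_text` is a hypothetical function that measures length in characters and packs whole sentences greedily. The model's actual chunking may differ.

```python
import re


def chunk_text(text: str, chunk_length: int = 250) -> list[str]:
    """Greedily pack whole sentences into chunks of at most chunk_length chars.

    Assumed behavior for illustration only. A single sentence longer than
    chunk_length is kept whole as its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= chunk_length or not current:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


EXAMPLE_TEXT = (
    "I must not fear. Fear is the mind-killer. Fear is the little-death that "
    "brings total obliteration. I will face my fear. I will permit it to pass "
    "over me and through me. And when it has gone past I will turn the inner "
    "eye to see its path. Where the fear has gone there will be nothing. "
    "Only I will remain."
)

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_text(EXAMPLE_TEXT, chunk_length=200), 1):
        print(i, len(chunk), chunk)
```

Splitting on sentence boundaries rather than at a fixed character offset avoids cutting a word or clause in half, which tends to matter for how natural the synthesized speech sounds at chunk joins.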
Version Details
Version ID
0494f04972b675631af41c253a45c4341bf637f07eed9a39bad3b1fd66f73a2e
Version Created
January 24, 2025