x-lance/f5-tts

31.5K runs · Oct 2024 · Cog 0.9.20 · GitHub · Paper · License
Tags: speech-style-transfer, text-to-speech, voice-cloning

About

F5-TTS, the new state of the art in open-source voice cloning

Example Output

(example audio omitted)

Performance Metrics

Prediction time: 5.25s
Total time: 5.26s
All Input Parameters
{
  "gen_text": "captain teemo, on duty!",
  "ref_text": "never underestimate the power of the scout's code",
  "ref_audio": "https://replicate.delivery/pbxt/LnHEJTVWhjLcpGQJTBralyztLwl8diaLyHjP2a1KXJ8dxVWv/Teemo_Original_Taunt.ogg",
  "remove_silence": true,
  "custom_split_words": ""
}
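For reference, the payload above can be assembled programmatically before submitting a prediction. A minimal sketch follows; the field names, defaults, and version ID come from this page, while the `build_input` helper itself is hypothetical and not part of the model's API.

```python
# Minimal sketch: assemble an input payload for x-lance/f5-tts.
# Field names and defaults mirror this page's parameter list;
# the helper itself is illustrative, not part of the model's API.

def build_input(gen_text, ref_audio, ref_text="", remove_silence=True,
                custom_split_words="", speed=1.0):
    """Return an input dict matching this model's input schema."""
    if not gen_text or not ref_audio:
        raise ValueError("gen_text and ref_audio are required")
    return {
        "gen_text": gen_text,
        "ref_text": ref_text,
        "ref_audio": ref_audio,
        "remove_silence": remove_silence,
        "custom_split_words": custom_split_words,
        "speed": speed,
    }

# To run for real (requires `pip install replicate` and a
# REPLICATE_API_TOKEN in the environment):
#
#   import replicate
#   url = replicate.run(
#       "x-lance/f5-tts:87faf6dd7a692dd82043f662e76369cab126a2cf1937e25a9d41e0b834fd230e",
#       input=build_input(
#           gen_text="captain teemo, on duty!",
#           ref_text="never underestimate the power of the scout's code",
#           ref_audio="https://replicate.delivery/pbxt/LnHEJTVWhjLcpGQJTBralyztLwl8diaLyHjP2a1KXJ8dxVWv/Teemo_Original_Taunt.ogg",
#       ),
#   )
```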
Input Parameters
speed (number, default: 1, range: 0.1 - 3)
Speed of the generated audio
gen_text (string, required)
Text to generate
ref_text (string)
Reference text
ref_audio (string, required)
Reference audio for voice cloning
remove_silence (boolean, default: true)
Automatically remove silences?
custom_split_words (string, default: "")
Custom split words, comma-separated
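The constraints above can also be checked client-side before submitting a prediction. A small sketch, where the range and defaults are taken from this page and the `validate_params` helper is hypothetical:

```python
# Hypothetical client-side check of the parameter constraints
# listed on this page (required fields, speed range 0.1 - 3).
def validate_params(params):
    """Return a list of constraint violations (empty list means OK)."""
    errors = []
    if not params.get("gen_text"):
        errors.append("gen_text is required")
    if not params.get("ref_audio"):
        errors.append("ref_audio is required")
    speed = params.get("speed", 1)      # default 1 per the schema
    if not 0.1 <= speed <= 3:           # allowed range per the schema
        errors.append("speed must be between 0.1 and 3")
    return errors
```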
Output Schema

Output

Type: string (format: uri)
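Since the output is a URI pointing at the generated audio, saving it locally takes only the standard library. A sketch, where the `output_filename` helper is hypothetical:

```python
# The model returns a URI string; derive a local filename from it,
# then (optionally) download. The helper below is illustrative.
import os
from urllib.parse import urlparse

def output_filename(uri, default="output.wav"):
    """Pick a local filename from the output URI's path."""
    name = os.path.basename(urlparse(uri).path)
    return name or default

# To actually download (network access required):
#
#   from urllib.request import urlretrieve
#   urlretrieve(uri, output_filename(uri))
```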

Example Execution Logs
Generating: captain teemo, on duty!
[*] Converting reference audio...
[+] Converted reference audio.
[*] Using custom reference text...
[+] Reference text: never underestimate the power of the scout's code
[*] Forming batches...
[+] Formed batches: 1
------ Batch 1 -------------------
captain teemo, on duty!
--------------------------------------
0%|          | 0/1 [00:00<?, ?it/s]Building prefix dict from the default dictionary ...
DEBUG:jieba:Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
DEBUG:jieba:Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.494 seconds.
DEBUG:jieba:Loading model cost 0.494 seconds.
Prefix dict has been built successfully.
DEBUG:jieba:Prefix dict has been built successfully.
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/vocos/pretrained.py:70: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
state_dict = torch.load(model_path, map_location="cpu")
100%|██████████| 1/1 [00:04<00:00,  4.49s/it]
100%|██████████| 1/1 [00:04<00:00,  4.49s/it]
[*] Removing silence...
[+] Removed silence
[*] Saving output.wav...
[+] Saved output.wav
Version Details
Version ID
87faf6dd7a692dd82043f662e76369cab126a2cf1937e25a9d41e0b834fd230e
Version Created
October 14, 2024
Run on Replicate