nyxynyx/f5-tts
About
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching. Clones a voice from a short reference audio clip and generates speech for arbitrary text in that voice.
Example Output
Performance Metrics
- Prediction time: 8.40s
- Total time: 79.03s
All Input Parameters
{
  "gen_text": "When something is important enough, you do it even if the odds are not in your favor.",
  "ref_audio": "https://replicate.delivery/pbxt/Lo5PhtzOHIpE658sLaFoyibIHDYcJIngl5NaJ74dDkMYPwms/elon_musk_with_tucker_carlson_extract_02.mp3",
  "remove_silence": true,
  "custom_split_words": ""
}
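To reproduce this prediction programmatically, a minimal sketch using the official `replicate` Python client looks like the following; the version string is taken from the Version Details section below, and the inputs are copied from above. It assumes `pip install replicate` and a `REPLICATE_API_TOKEN` environment variable.

```python
import replicate

# Pin the exact version listed under Version Details below.
output = replicate.run(
    "nyxynyx/f5-tts:e0e48acce40cb39931ed5f1b04e21492bdcf2eb0a0f96842a5e537531e86389b",
    input={
        "gen_text": "When something is important enough, you do it even if the odds are not in your favor.",
        "ref_audio": "https://replicate.delivery/pbxt/Lo5PhtzOHIpE658sLaFoyibIHDYcJIngl5NaJ74dDkMYPwms/elon_musk_with_tucker_carlson_extract_02.mp3",
        "remove_silence": True,
        "custom_split_words": "",
    },
)
# Depending on client version, `output` is a URL string or a file-like object
# pointing at the generated audio.
print(output)
```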
Input Parameters
- gen_text (required): Text to generate speech from
- ref_text: Transcript of the reference audio. If omitted, the reference audio is transcribed automatically with Whisper (see the sketch after this list).
- ref_audio (required): Reference audio for voice cloning
- remove_silence: Automatically remove silences?
- custom_split_words: Custom split words, comma separated
Output Schema
- Output: URI pointing to the generated audio file (a single output.wav, per the execution logs below)
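The returned URI can be saved locally with the standard library; a minimal sketch, assuming `output` is the value returned by the `replicate.run` call shown earlier (the filename is arbitrary):

```python
import urllib.request

# Download the generated speech to a local file.
urllib.request.urlretrieve(str(output), "generated_speech.wav")
```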
Example Execution Logs
Generating: When something is important enough, you do it even if the odds are not in your favor.
[*] Converting reference audio...
[+] Converted reference audio.
No reference text provided, transcribing reference audio...
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/transformers/models/whisper/generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.
  warnings.warn(
You have passed task=transcribe, but also have set `forced_decoder_ids` to [[1, None], [2, 50360]] which creates a conflict. `forced_decoder_ids` will be ignored in favor of task=transcribe.
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
[+] Finished transcription
[+] Reference text: And he's writing a book now, which hopefully he'll publish soon, which is about suicidal empathy. Where you have so much empathy, you're actually suiciding society.
[*] Forming batches...
[+] Number of batches: 1
------ Batch 1 -------------------
When something is important enough, you do it even if the odds are not in your favor.
--------------------------------------
  0%|          | 0/1 [00:00<?, ?it/s]
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.479 seconds.
Prefix dict has been built successfully.
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/vocos/pretrained.py:70: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  state_dict = torch.load(model_path, map_location="cpu")
100%|██████████| 1/1 [00:04<00:00, 4.34s/it]
100%|██████████| 1/1 [00:04<00:00, 4.34s/it]
[*] Removing silence...
[*] Removed silence
[*] Saving output.wav
[*] Saved output.wav
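The "Removing silence" step above trims silent spans from the generated waveform. A minimal sketch of that kind of post-processing with pydub follows; the thresholds are illustrative assumptions, not the model's actual settings.

```python
from pydub import AudioSegment
from pydub.silence import split_on_silence

def remove_silence(path: str, out_path: str = "output.wav") -> None:
    audio = AudioSegment.from_file(path)
    # Split on spans quieter than -40 dBFS lasting at least 500 ms
    # (illustrative values), keeping 100 ms of padding around each chunk,
    # then stitch the non-silent chunks back together.
    chunks = split_on_silence(
        audio, min_silence_len=500, silence_thresh=-40, keep_silence=100
    )
    combined = AudioSegment.empty()
    for chunk in chunks:
        combined += chunk
    combined.export(out_path, format="wav")
```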
Version Details
- Version ID: e0e48acce40cb39931ed5f1b04e21492bdcf2eb0a0f96842a5e537531e86389b
- Version Created: October 23, 2024