ttsds/fishspeech_1_4
About
Fish Speech V1.4, a text-to-speech model that synthesizes speech in the voice of a reference audio clip, given that clip and its transcript.
Example Output
Performance Metrics
- Prediction Time: 4.11s
- Total Time: 78.19s
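The gap between total time and prediction time can be read off directly from the two metrics above; on Replicate this gap usually reflects queueing and cold-start setup, though that interpretation is an assumption, not something the page states.

```python
# Derive the non-prediction overhead from the two reported metrics.
prediction_time = 4.11   # seconds, "Prediction Time" above
total_time = 78.19       # seconds, "Total Time" above

overhead = total_time - prediction_time
print(round(overhead, 2))  # → 74.08
```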
All Input Parameters
{
  "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
  "text_reference": "and keeping eternity before the eyes, though much",
  "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
}
Input Parameters
- text (required): the text to synthesize
- text_reference (required): the transcript of the reference audio clip
- speaker_reference (required): URL of a reference audio clip whose voice should be cloned
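A minimal sketch of assembling the input payload for this model. The parameter names and example values come from "All Input Parameters" above; the commented-out `replicate.run()` call assumes the Replicate Python client and an API token in the environment, and is not verified here.

```python
# Build and validate the input dict for ttsds/fishspeech_1_4.
REQUIRED_PARAMS = ("text", "text_reference", "speaker_reference")

def build_input(text, text_reference, speaker_reference):
    """Assemble the input payload, rejecting empty required fields."""
    payload = {
        "text": text,
        "text_reference": text_reference,
        "speaker_reference": speaker_reference,
    }
    missing = [k for k in REQUIRED_PARAMS if not payload[k]]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return payload

payload = build_input(
    text="With tenure, Suzie'd have all the more leisure for yachting, "
         "but her publications are no good.",
    text_reference="and keeping eternity before the eyes, though much",
    speaker_reference="https://replicate.delivery/pbxt/"
                      "MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/"
                      "example_en.wav",
)

# Hypothetical invocation via the Replicate client (requires network + token):
# import replicate
# output = replicate.run(
#     "ttsds/fishspeech_1_4:7d55af8314c9ec4206d76c1e958cd8807c9c1bd59bffcfec363aea89e7179dd8",
#     input=payload,
# )
```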
Output Schema
Output
Example Execution Logs
2025-01-28 18:47:27.414 | INFO | tools.llama.generate:generate_long:759 - Encoded text: With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.
2025-01-28 18:47:27.415 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature. warnings.warn(
  0%| | 0/3965 [00:00<?, ?it/s]
  3%|▎ | 122/3965 [00:03<01:41, 37.95it/s]
2025-01-28 18:47:30.827 | INFO | tools.llama.generate:generate_long:832 - Generated 124 tokens in 3.41 seconds, 36.34 tokens/sec
2025-01-28 18:47:30.827 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 17.97 GB/s
2025-01-28 18:47:30.827 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 1.63 GB
Next sample
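The throughput figure in the log can be roughly reproduced from the reported token count and wall time; the small gap versus the logged 36.34 tokens/sec presumably comes from the generator timing a slightly different interval than the rounded values printed here.

```python
# Recompute decoding throughput from the log's own numbers.
tokens = 124     # "Generated 124 tokens"
seconds = 3.41   # "in 3.41 seconds"

throughput = tokens / seconds  # tokens per second
print(round(throughput, 2))  # → 36.36, vs. 36.34 reported in the log
```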
Version Details
- Version ID: 7d55af8314c9ec4206d76c1e958cd8807c9c1bd59bffcfec363aea89e7179dd8
- Version Created: January 28, 2025