ttsds/fishspeech_1_5 📝🖼️ → 🖼️
About
The Fish Speech V1.5 model.
Example Output
Output
Performance Metrics
- Prediction Time: 5.18s
- Total Time: 110.80s
All Input Parameters
{
  "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
  "text_reference": "and keeping eternity before the eyes, though much",
  "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
}
Input Parameters
- text (required)
- text_reference (required)
- speaker_reference (required)
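Since all three parameters are required, it can help to assemble and check the payload before submitting it. The following is a minimal sketch, not part of the model's own code: `build_input` and `REQUIRED_FIELDS` are hypothetical names introduced for illustration, and the values are taken from the example input above.

```python
# Hypothetical helper: the model page only states that these three
# fields are required; the validation logic here is an assumption.
REQUIRED_FIELDS = ("text", "text_reference", "speaker_reference")


def build_input(text, text_reference, speaker_reference):
    """Assemble the input payload and reject empty required fields."""
    payload = {
        "text": text,
        "text_reference": text_reference,
        "speaker_reference": speaker_reference,
    }
    missing = [name for name in REQUIRED_FIELDS if not payload[name]]
    if missing:
        raise ValueError("missing required input(s): " + ", ".join(missing))
    return payload


# Values copied from the example input shown on this page.
payload = build_input(
    text="With tenure, Suzie'd have all the more leisure for yachting, "
         "but her publications are no good.",
    text_reference="and keeping eternity before the eyes, though much",
    speaker_reference="https://replicate.delivery/pbxt/"
                      "MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
)
```

With the Replicate Python client installed and an API token configured, a payload like this would typically be passed as `replicate.run("ttsds/fishspeech_1_5:<version id>", input=payload)`, using the version ID listed under Version Details. That call needs network access and is not exercised here.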
Output Schema
Output
Example Execution Logs
2025-01-28 18:11:53.629 | INFO | tools.llama.generate:generate_long:789 - Encoded text: With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.
2025-01-28 18:11:53.630 | INFO | tools.llama.generate:generate_long:807 - Generating sentence 1/1 of sample 1/1
0%| | 0/8055 [00:00<?, ?it/s]
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
warnings.warn(
2%|▏ | 147/8055 [00:04<03:40, 35.85it/s]
2025-01-28 18:11:57.940 | INFO | tools.llama.generate:generate_long:861 - Generated 149 tokens in 4.31 seconds, 34.57 tokens/sec
2025-01-28 18:11:57.940 | INFO | tools.llama.generate:generate_long:864 - Bandwidth achieved: 22.06 GB/s
2025-01-28 18:11:57.940 | INFO | tools.llama.generate:generate_long:869 - GPU Memory used: 2.02 GB
Next sample
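The throughput figure in the log can be checked against the token count and duration reported on the same line. The sketch below parses that line with a regular expression and recomputes the rate; the pattern and the function name `parse_generation_stats` are assumptions based on this single log sample, not a documented log format.

```python
import re

# Log line copied from the execution logs above.
LOG_LINE = (
    "2025-01-28 18:11:57.940 | INFO | tools.llama.generate:generate_long:861 - "
    "Generated 149 tokens in 4.31 seconds, 34.57 tokens/sec"
)

# Assumed pattern, derived from the one sample line on this page.
PATTERN = re.compile(
    r"Generated (\d+) tokens in ([\d.]+) seconds, ([\d.]+) tokens/sec"
)


def parse_generation_stats(line):
    """Return token count, duration, and reported/recomputed rates, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    tokens = int(match.group(1))
    seconds = float(match.group(2))
    return {
        "tokens": tokens,
        "seconds": seconds,
        "reported_rate": float(match.group(3)),
        "recomputed_rate": tokens / seconds,
    }


stats = parse_generation_stats(LOG_LINE)
print(stats)
```

The recomputed rate (149 tokens over 4.31 seconds) agrees with the reported 34.57 tokens/sec to within rounding of the logged values.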
Version Details
- Version ID: f81057e21ad025b00703b8a2f63283d108829b7512f85c4c723c3edcc125f1bc
- Version Created: January 28, 2025