ttsds/fishspeech_1_2 📝🖼️ → 🖼️

▶️ 247 runs 📅 Jan 2025 ⚙️ Cog 0.13.6 🔗 GitHub 📄 Paper ⚖️ License
text-to-speech voice-cloning

About

Fish Speech V1.2, a text-to-speech model with voice cloning: given a reference audio clip and its transcript, it synthesizes new text in the reference speaker's voice.

Example Output

(audio sample not reproduced in this text export)

Performance Metrics

5.64s Prediction Time
76.20s Total Time
All Input Parameters
{
  "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
  "text_reference": "and keeping eternity before the eyes, though much",
  "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
}
Input Parameters
text (required) Type: string. The text to synthesize.
text_reference (required) Type: string. Transcript of the speaker reference audio.
speaker_reference (required) Type: string. URL of a reference audio clip whose voice should be cloned.
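A minimal sketch of calling this model from Python, assuming the official `replicate` client is installed and `REPLICATE_API_TOKEN` is set in the environment. The parameter descriptions in the comments are inferred from the example inputs above, and the version hash is the one listed under Version Details; treat both as assumptions rather than official documentation.

```python
# Sketch: invoking ttsds/fishspeech_1_2 on Replicate.
# Assumes: `pip install replicate` and a REPLICATE_API_TOKEN env var.

def build_input(text, text_reference, speaker_reference):
    """Assemble and validate the three required string parameters."""
    params = {
        "text": text,                           # text to synthesize
        "text_reference": text_reference,       # transcript of the reference clip (inferred)
        "speaker_reference": speaker_reference, # URL of the reference audio (inferred)
    }
    for name, value in params.items():
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty string")
    return params

if __name__ == "__main__":
    inputs = build_input(
        "With tenure, Suzie'd have all the more leisure for yachting, "
        "but her publications are no good.",
        "and keeping eternity before the eyes, though much",
        "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
    )
    # Uncomment to run against the live API:
    # import replicate
    # url = replicate.run(
    #     "ttsds/fishspeech_1_2:"
    #     "0cfe0d652ead3df835da5a020063427419b28c66f60b322471fb5a456079659f",
    #     input=inputs,
    # )
    # print(url)  # per the output schema, a URI string pointing at the audio
```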
Output Schema

Output

Type: string, Format: uri
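Since the output is a URI string rather than raw audio, a prediction result still has to be fetched. A small sketch of saving it locally with only the standard library; the function name is hypothetical and the URL argument would be whatever the prediction returns.

```python
# Sketch: persist the model's output, which the schema describes as a URI string.
import urllib.request

def save_audio(url, path="output.wav"):
    """Download the audio at `url` (any scheme urllib supports) to `path`."""
    urllib.request.urlretrieve(url, path)
    return path
```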

Example Execution Logs
2025-01-28 17:21:00.785 | INFO     | tools.llama.generate:generate_long:432 - Encoded text: With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.
2025-01-28 17:21:00.785 | INFO     | tools.llama.generate:generate_long:450 - Generating sentence 1/1 of sample 1/1
  0%|          | 0/3892 [00:00<?, ?it/s]/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
warnings.warn(
  0%|          | 6/3892 [00:00<01:12, 53.45it/s]
  [... repeated progress lines trimmed; throughput held at roughly 50-56 it/s ...]
  7%|▋         | 255/3892 [00:04<01:07, 53.56it/s]
2025-01-28 17:21:05.732 | INFO     | tools.llama.generate:generate_long:505 - Generated 257 tokens in 4.95 seconds, 51.95 tokens/sec
2025-01-28 17:21:05.732 | INFO     | tools.llama.generate:generate_long:508 - Bandwidth achieved: 25.47 GB/s
2025-01-28 17:21:05.732 | INFO     | tools.llama.generate:generate_long:513 - GPU Memory used: 1.56 GB
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
Version Details
Version ID
0cfe0d652ead3df835da5a020063427419b28c66f60b322471fb5a456079659f
Version Created
January 28, 2025
Run on Replicate →