ttsds/fishspeech_1_1_large

236 runs · Jan 2025 · Cog 0.13.6
text-to-speech voice-cloning

Example Output

(audio output not reproduced here)

Performance Metrics

4.44s Prediction Time
115.05s Total Time (total time presumably includes queue and cold-boot overhead on top of the prediction itself)
All Input Parameters
{
  "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
  "text_reference": "and keeping eternity before the eyes, though much.",
  "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav"
}
Input Parameters
text (required) Type: string
text_reference (required) Type: string
speaker_reference (required) Type: string
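
A minimal sketch of calling this version from Python, assuming the standard replicate client (pip install replicate) and a REPLICATE_API_TOKEN in the environment. The comments describing text_reference and speaker_reference follow the usual Fish-Speech voice-cloning convention; they are not documented on this page:

import replicate

output = replicate.run(
    "ttsds/fishspeech_1_1_large:bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37",
    input={
        # Text to synthesize in the cloned voice.
        "text": "With tenure, Suzie'd have all the more leisure for yachting, "
        "but her publications are no good.",
        # Transcript of the reference audio clip (Fish-Speech convention).
        "text_reference": "and keeping eternity before the eyes, though much.",
        # Audio clip of the voice to clone.
        "speaker_reference": "https://replicate.delivery/pbxt/MNFXdPaUPOwYCZjZM4azsymbzE2TCV2WJXfGpeV2DrFWaSq8/example_en.wav",
    },
)
print(output)  # per the output schema below: a URI string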
Output Schema

Output

Type: string
Format: uri
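
Since the output is a URI, a short sketch for saving the generated audio locally. This assumes output is a plain URL string; newer replicate client versions may instead return a FileOutput object with its own .read() method:

import urllib.request

# Download the generated audio to a local file.
with urllib.request.urlopen(str(output)) as resp, open("cloned_voice.wav", "wb") as f:
    f.write(resp.read())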

Example Execution Logs
2025-01-30 17:01:26.111 | INFO     | tools.llama.generate:generate_long:491 - Encoded text: With tenure, Suzie'd have all
2025-01-30 17:01:26.111 | INFO     | tools.llama.generate:generate_long:491 - Encoded text: the more leisure for yachting,
2025-01-30 17:01:26.112 | INFO     | tools.llama.generate:generate_long:491 - Encoded text: but her publications are no
2025-01-30 17:01:26.112 | INFO     | tools.llama.generate:generate_long:491 - Encoded text: good.
2025-01-30 17:01:26.112 | INFO     | tools.llama.generate:generate_long:509 - Generating sentence 1/4 of sample 1/1
  0%|          | 0/1857 [00:00<?, ?it/s]
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/backends/cuda/__init__.py:342: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see, torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
  warnings.warn(
  0%|          | 6/1857 [00:00<00:35, 51.79it/s]
  ...
  4%|▍         | 81/1857 [00:01<00:32, 54.48it/s]
2025-01-30 17:01:27.699 | INFO     | tools.llama.generate:generate_long:565 - Generated 83 tokens in 1.59 seconds, 52.28 tokens/sec
2025-01-30 17:01:27.700 | INFO     | tools.llama.generate:generate_long:568 - Bandwidth achieved: 53.67 GB/s
2025-01-30 17:01:27.700 | INFO     | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB
2025-01-30 17:01:27.700 | INFO     | tools.llama.generate:generate_long:509 - Generating sentence 2/4 of sample 1/1
  0%|          | 0/1726 [00:00<?, ?it/s]
  ...
  3%|▎         | 50/1726 [00:00<00:30, 54.43it/s]
2025-01-30 17:01:28.645 | INFO     | tools.llama.generate:generate_long:565 - Generated 52 tokens in 0.94 seconds, 55.04 tokens/sec
2025-01-30 17:01:28.645 | INFO     | tools.llama.generate:generate_long:568 - Bandwidth achieved: 56.50 GB/s
2025-01-30 17:01:28.645 | INFO     | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB
2025-01-30 17:01:28.646 | INFO     | tools.llama.generate:generate_long:509 - Generating sentence 3/4 of sample 1/1
  0%|          | 0/1629 [00:00<?, ?it/s]
  ...
  3%|▎         | 43/1629 [00:00<00:29, 54.22it/s]
2025-01-30 17:01:29.459 | INFO     | tools.llama.generate:generate_long:565 - Generated 45 tokens in 0.81 seconds, 55.34 tokens/sec
2025-01-30 17:01:29.459 | INFO     | tools.llama.generate:generate_long:568 - Bandwidth achieved: 56.81 GB/s
2025-01-30 17:01:29.459 | INFO     | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB
2025-01-30 17:01:29.459 | INFO     | tools.llama.generate:generate_long:509 - Generating sentence 4/4 of sample 1/1
  0%|          | 0/1561 [00:00<?, ?it/s]
  ...
  1%|▏         | 22/1561 [00:00<00:29, 52.42it/s]
2025-01-30 17:01:29.899 | INFO     | tools.llama.generate:generate_long:565 - Generated 24 tokens in 0.44 seconds, 54.53 tokens/sec
2025-01-30 17:01:29.900 | INFO     | tools.llama.generate:generate_long:568 - Bandwidth achieved: 55.97 GB/s
2025-01-30 17:01:29.900 | INFO     | tools.llama.generate:generate_long:573 - GPU Memory used: 4.01 GB
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
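
The per-sentence throughput figures in the logs are self-consistent, and dividing the reported bandwidth by the token rate gives a rough estimate of how much data is streamed per generated token. A quick arithmetic check against sentence 1/4; the GB-per-token figure is inferred from the log, not a documented property of the model:

# Sentence 1/4: "Generated 83 tokens in 1.59 seconds, 52.28 tokens/sec"
tokens, seconds = 83, 1.59
tok_per_sec = tokens / seconds            # ~52.2; the logged 52.28 uses unrounded time

# "Bandwidth achieved: 53.67 GB/s" divided by the token rate suggests
# roughly how many bytes are read per generated token.
bytes_per_token_gb = 53.67 / tok_per_sec  # ~1.03 GB per token
print(f"{tok_per_sec:.2f} tok/s, ~{bytes_per_token_gb:.2f} GB read per token")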
Version Details
Version ID
bf25b86020c83763b6727b138d7b0c3308dd210ee059accc00cb6f1971bbcd37
Version Created
January 30, 2025
Run on Replicate →