thlz998/chat-tts

3.1K runs | Jun 2024 | Cog 0.8.6 | GitHub | Paper | License
conversational-tts multi-speaker multilingual prosody-control text-to-speech

About

This is an implementation of ChatTTS as a Cog model.

Example Output

Output

Performance Metrics

196.74s Prediction Time
247.45s Total Time
All Input Parameters
{
  "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音，还能控制[laugh]笑声啊[laugh]，\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意，chat T T S 的使用应遵守法律和伦理准则，避免滥用的安全风险。[uv_break]",
  "top_k": 20,
  "top_p": 0.7,
  "voice": 2222,
  "prompt": "",
  "skip_refine": 0,
  "temperature": 0.3,
  "custom_voice": 0
}
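The parameters above form the `input` object of a prediction request. The following is a minimal sketch, assuming Replicate's public HTTP API (`POST https://api.replicate.com/v1/predictions` with a bearer token) and the version ID listed under Version Details. The helper name `build_prediction_request` and the `REPLICATE_API_TOKEN` environment variable are illustrative choices, not part of this model page, and the request is only constructed here, never sent.

```python
import json
import os
import urllib.request

# Version ID copied from the "Version Details" section of this page.
VERSION_ID = "fdb4f547d19c9591d7e0223c88b14886c110129c0e206ddbb97fe7a344162868"

# Assumption: Replicate's standard endpoint for creating predictions.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(text, token, **overrides):
    """Build (but do not send) a prediction request. Hypothetical helper."""
    # Defaults taken from the example input shown on this page.
    model_input = {
        "text": text,
        "top_k": 20,
        "top_p": 0.7,
        "voice": 2222,
        "prompt": "",
        "skip_refine": 0,
        "temperature": 0.3,
        "custom_voice": 0,
    }
    unknown = set(overrides) - set(model_input)
    if unknown:
        raise ValueError(f"unknown input parameter(s): {sorted(unknown)}")
    model_input.update(overrides)

    body = json.dumps({"version": VERSION_ID, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    token = os.environ.get("REPLICATE_API_TOKEN", "<token>")
    req = build_prediction_request("Hello world!", token, temperature=0.3)
    print(req.get_method(), req.full_url)
    print(json.loads(req.data)["input"])
```

Passing the returned object to `urllib.request.urlopen` would submit the prediction. Polling for the result and downloading the audio are left out of this sketch.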
Input Parameters
text (Type: string, Default: Hello world!)
Text to be synthesized
top_k (Type: integer, Default: 20, Range: 0 - ∞)
Top-k sampling parameter
top_p (Type: number, Default: 0.7, Range: 0 - 1)
Top-p sampling parameter
voice (Type: integer, Default: 2222, Range: 0 - ∞)
Voice identifier
prompt (Type: string, Default: empty string)
Prompt for refining text
skip_refine (Default: 0)
Skip refine text step
temperature (Type: number, Default: 0.3, Range: 0 - 1)
Temperature for sampling
custom_voice (Type: integer, Default: 0, Range: 0 - ∞)
Custom voice identifier
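The documented defaults and ranges can be checked on the client before a request is submitted. The following is a rough sketch: the `ChatTTSInput` class and its `validate` method are hypothetical names, the range bounds are assumed to be inclusive, and `skip_refine` is treated as an integer only because the example input uses `0` (the page does not state its type).

```python
from dataclasses import dataclass


@dataclass
class ChatTTSInput:
    """Hypothetical container mirroring the inputs documented above."""

    text: str = "Hello world!"
    top_k: int = 20
    top_p: float = 0.7
    voice: int = 2222
    prompt: str = ""
    skip_refine: int = 0  # assumption: type is not documented on the page
    temperature: float = 0.3
    custom_voice: int = 0

    def validate(self):
        """Return a list of range violations (empty when all values are valid)."""
        errors = []
        # Integer parameters documented with the range 0 - infinity.
        for name in ("top_k", "voice", "custom_voice"):
            if getattr(self, name) < 0:
                errors.append(f"{name} must be >= 0")
        # Numeric parameters documented with the range 0 - 1 (assumed inclusive).
        for name in ("top_p", "temperature"):
            if not 0 <= getattr(self, name) <= 1:
                errors.append(f"{name} must be within [0, 1]")
        return errors


if __name__ == "__main__":
    print(ChatTTSInput().validate())
    print(ChatTTSInput(top_p=1.5, top_k=-1).validate())
```

Returning a list rather than raising on the first problem lets a caller report every out-of-range value at once.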
Output Schema

Output

Type: object

Example Execution Logs
voice=2222,custom_voice=0
start_time=1717318413.4744363
INFO:ChatTTS.core:All initialized.
  0%|          | 0/384 [00:00<?, ?it/s]
  0%|          | 1/384 [00:38<4:07:14, 38.73s/it]
  1%|          | 2/384 [01:18<4:10:01, 39.27s/it]
  1%|          | 3/384 [01:56<4:06:43, 38.85s/it]
  3%|▎         | 10/384 [02:33<1:10:23, 11.29s/it]
  5%|▌         | 21/384 [02:33<24:04,  3.98s/it]
  9%|▊         | 33/384 [02:33<11:36,  1.98s/it]
 11%|█         | 43/384 [02:33<20:20,  3.58s/it]
  0%|          | 0/2048 [00:00<?, ?it/s]
  0%|          | 2/2048 [00:37<10:38:51, 18.74s/it]
  1%|          | 13/2048 [00:37<1:12:01,  2.12s/it]
  1%|          | 24/2048 [00:37<31:46,  1.06it/s]  
  2%|▏         | 35/2048 [00:37<17:42,  1.89it/s]
  2%|▏         | 46/2048 [00:37<10:53,  3.06it/s]
  3%|▎         | 57/2048 [00:37<07:04,  4.69it/s]
  3%|▎         | 68/2048 [00:38<04:46,  6.92it/s]
  4%|▍         | 79/2048 [00:38<03:17,  9.95it/s]
  4%|▍         | 90/2048 [00:38<02:20, 13.98it/s]
  5%|▍         | 101/2048 [00:38<01:41, 19.20it/s]
  5%|▌         | 112/2048 [00:38<01:15, 25.72it/s]
  6%|▌         | 123/2048 [00:38<00:57, 33.54it/s]
  7%|▋         | 134/2048 [00:38<00:45, 42.47it/s]
  7%|▋         | 145/2048 [00:38<00:36, 52.03it/s]
  8%|▊         | 156/2048 [00:38<00:30, 61.83it/s]
  8%|▊         | 168/2048 [00:39<00:26, 71.96it/s]
  9%|▊         | 179/2048 [00:39<00:23, 80.05it/s]
  9%|▉         | 190/2048 [00:39<00:21, 86.88it/s]
 10%|▉         | 202/2048 [00:39<00:19, 93.18it/s]
 10%|█         | 213/2048 [00:39<00:18, 97.53it/s]
 11%|█         | 224/2048 [00:39<00:18, 100.63it/s]
 11%|█▏        | 235/2048 [00:39<00:17, 103.13it/s]
 12%|█▏        | 246/2048 [00:39<00:17, 105.03it/s]
 13%|█▎        | 258/2048 [00:39<00:16, 106.52it/s]
 13%|█▎        | 269/2048 [00:39<00:16, 107.17it/s]
 14%|█▎        | 280/2048 [00:40<00:16, 107.31it/s]
 14%|█▍        | 291/2048 [00:40<00:16, 107.40it/s]
 15%|█▍        | 302/2048 [00:40<00:16, 107.45it/s]
 15%|█▌        | 313/2048 [00:40<00:16, 107.73it/s]
 16%|█▌        | 324/2048 [00:40<00:15, 107.77it/s]
 16%|█▋        | 335/2048 [00:40<00:15, 108.04it/s]
 17%|█▋        | 346/2048 [00:40<00:15, 107.96it/s]
 17%|█▋        | 357/2048 [00:40<00:15, 108.20it/s]
 18%|█▊        | 368/2048 [00:40<00:15, 108.13it/s]
 18%|█▊        | 369/2048 [00:40<03:05,  9.03it/s]
/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
Inference time: 195.29 s
Audio duration: 29.46 s
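The last two log lines report the inference time (195.29 s) and the duration of the generated audio (29.46 s). Together with the figures under Performance Metrics, they give a rough picture of this run's speed. The sketch below works through the arithmetic; `real_time_factor` is a hypothetical helper, and reading the gap between total time and prediction time as queueing and setup overhead is an assumption, not something the page states.

```python
def real_time_factor(inference_seconds, audio_seconds):
    """Seconds of compute spent per second of generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


# Values copied from the logs and the Performance Metrics section.
inference_s = 195.29   # inference time reported in the logs
audio_s = 29.46        # audio duration reported in the logs
prediction_s = 196.74  # "Prediction Time"
total_s = 247.45       # "Total Time"

rtf = real_time_factor(inference_s, audio_s)
outside_prediction_s = total_s - prediction_s

print(f"real-time factor: {rtf:.2f}")                           # 6.63
print(f"time outside prediction: {outside_prediction_s:.2f} s")  # 50.71 s
```

A real-time factor above 1 means synthesis ran slower than playback. For this run, each second of audio took about 6.6 seconds to generate.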
Version Details
Version ID
fdb4f547d19c9591d7e0223c88b14886c110129c0e206ddbb97fe7a344162868
Version Created
June 2, 2024