cuuupid/zonos 📝🖼️ → 🖼️
About
Zonos-v0.1 beta is a state-of-the-art (SOTA) text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.

Example Output
Output
Performance Metrics
- Prediction Time: 53.43s
- Total Time: 100.62s
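The gap between the two figures is time spent outside the prediction itself. The following is a minimal sketch of that arithmetic, assuming (this page does not state it) that Total Time includes queueing and setup in addition to Prediction Time. The function name `overhead_seconds` is hypothetical.

```python
def overhead_seconds(prediction_time: float, total_time: float) -> float:
    """Return the seconds not spent in the prediction itself, rounded to 2 places."""
    return round(total_time - prediction_time, 2)


# Figures taken from the example run on this page.
overhead = overhead_seconds(53.43, 100.62)
print(f"Overhead outside prediction: {overhead:.2f}s")
```

For this run, roughly half of the wall-clock time falls outside the prediction, so it may be worth tracking both numbers when estimating latency.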
All Input Parameters
{
  "text": "I don't really care what you call me. I've been a silent spectator, watching species evolve, empires rise and fall. But always remember, I am mighty and enduring. Respect me and I'll nurture you; ignore me and you shall face the consequences.",
  "audio": "https://replicate.delivery/pbxt/MTiggYvvLjNJAZngjPgl0IzZ1x07SRbC4m3l3y6h4D3ih1Gl/Mel_Original_MoveFirst_2.mp3"
}
Input Parameters
- text (required): Text to speak.
- audio (optional): Audio with the voice to mimic.
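The two parameters above map directly onto a request payload. The following is a minimal sketch of how one might build that payload and submit it, assuming the `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `synthesize` are hypothetical; the model name and version ID are the ones listed on this page.

```python
from typing import Optional

# Model name and version ID as listed on this page.
MODEL = "cuuupid/zonos:c86319441e6805516974afc719860dbba372cf1e2466997dcb67e259bc47522e"


def build_input(text: str, audio: Optional[str] = None) -> dict:
    """Build the input payload: `text` is required, `audio` is optional."""
    if not text:
        raise ValueError("text is required")
    payload = {"text": text}
    if audio is not None:
        # URL of a reference recording whose voice should be mimicked.
        payload["audio"] = audio
    return payload


def synthesize(text: str, audio: Optional[str] = None):
    """Run the model through the Replicate client (performs a network call)."""
    import replicate  # assumed installed via `pip install replicate`

    return replicate.run(MODEL, input=build_input(text, audio))
```

Calling `synthesize("Hello from Zonos.")` would submit a prediction with the default voice; passing an `audio` URL as the second argument requests voice mimicry. The shape of the returned value is not documented on this page, so inspect it before relying on it.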
Output Schema
Output
Example Execution Logs
0%| | 0/2865 [00:00<?, ?it/s]
0%| | 1/2865 [00:00<14:18, 3.34it/s]
1%| | 17/2865 [00:00<00:53, 53.23it/s]
1%| | 33/2865 [00:00<00:33, 85.62it/s]
2%|▏ | 49/2865 [00:00<00:26, 107.18it/s]
2%|▏ | 65/2865 [00:00<00:23, 121.69it/s]
3%|▎ | 81/2865 [00:00<00:21, 131.55it/s]
3%|▎ | 97/2865 [00:00<00:20, 138.38it/s]
4%|▍ | 113/2865 [00:01<00:19, 143.09it/s]
5%|▍ | 129/2865 [00:01<00:18, 146.27it/s]
5%|▌ | 145/2865 [00:01<00:18, 148.48it/s]
6%|▌ | 161/2865 [00:01<00:18, 149.97it/s]
6%|▌ | 177/2865 [00:01<00:17, 151.01it/s]
7%|▋ | 193/2865 [00:01<00:17, 151.73it/s]
7%|▋ | 209/2865 [00:01<00:17, 152.20it/s]
8%|▊ | 225/2865 [00:01<00:17, 152.45it/s]
8%|▊ | 241/2865 [00:01<00:17, 152.59it/s]
9%|▉ | 257/2865 [00:01<00:17, 152.66it/s]
10%|▉ | 273/2865 [00:02<00:16, 152.70it/s]
10%|█ | 289/2865 [00:02<00:16, 152.71it/s]
11%|█ | 305/2865 [00:02<00:16, 152.74it/s]
11%|█ | 321/2865 [00:02<00:16, 152.76it/s]
12%|█▏ | 337/2865 [00:02<00:16, 152.75it/s]
12%|█▏ | 353/2865 [00:02<00:16, 152.73it/s]
13%|█▎ | 369/2865 [00:02<00:16, 152.64it/s]
13%|█▎ | 385/2865 [00:02<00:16, 152.39it/s]
14%|█▍ | 401/2865 [00:02<00:16, 152.20it/s]
15%|█▍ | 417/2865 [00:03<00:16, 152.19it/s]
15%|█▌ | 433/2865 [00:03<00:15, 152.17it/s]
16%|█▌ | 449/2865 [00:03<00:15, 152.14it/s]
16%|█▌ | 465/2865 [00:03<00:15, 152.09it/s]
17%|█▋ | 481/2865 [00:03<00:15, 152.09it/s]
17%|█▋ | 497/2865 [00:03<00:15, 152.05it/s]
18%|█▊ | 513/2865 [00:03<00:15, 151.95it/s]
18%|█▊ | 529/2865 [00:03<00:15, 151.73it/s]
19%|█▉ | 545/2865 [00:03<00:15, 151.61it/s]
20%|█▉ | 561/2865 [00:03<00:15, 151.56it/s]
20%|██ | 577/2865 [00:04<00:15, 151.51it/s]
21%|██ | 593/2865 [00:04<00:14, 151.49it/s]
21%|██▏ | 609/2865 [00:04<00:14, 151.42it/s]
22%|██▏ | 625/2865 [00:04<00:14, 151.32it/s]
22%|██▏ | 641/2865 [00:04<00:14, 151.11it/s]
23%|██▎ | 657/2865 [00:04<00:14, 150.95it/s]
23%|██▎ | 673/2865 [00:04<00:14, 150.89it/s]
24%|██▍ | 689/2865 [00:04<00:14, 150.80it/s]
25%|██▍ | 705/2865 [00:04<00:14, 150.76it/s]
25%|██▌ | 721/2865 [00:05<00:14, 150.73it/s]
26%|██▌ | 737/2865 [00:05<00:14, 150.63it/s]
26%|██▋ | 753/2865 [00:05<00:14, 150.53it/s]
27%|██▋ | 769/2865 [00:05<00:13, 150.41it/s]
27%|██▋ | 785/2865 [00:05<00:13, 150.31it/s]
28%|██▊ | 801/2865 [00:05<00:13, 150.24it/s]
29%|██▊ | 817/2865 [00:05<00:13, 150.15it/s]
29%|██▉ | 833/2865 [00:05<00:13, 150.01it/s]
30%|██▉ | 849/2865 [00:05<00:13, 149.94it/s]
30%|███ | 864/2865 [00:05<00:13, 149.88it/s]
31%|███ | 879/2865 [00:06<00:13, 149.82it/s]
31%|███ | 894/2865 [00:06<00:13, 149.44it/s]
32%|███▏ | 909/2865 [00:06<00:13, 149.37it/s]
32%|███▏ | 924/2865 [00:06<00:12, 149.35it/s]
33%|███▎ | 939/2865 [00:06<00:12, 149.34it/s]
33%|███▎ | 954/2865 [00:06<00:12, 149.31it/s]
34%|███▍ | 969/2865 [00:06<00:12, 149.22it/s]
34%|███▍ | 984/2865 [00:06<00:12, 149.04it/s]
35%|███▍ | 999/2865 [00:06<00:12, 149.01it/s]
35%|███▌ | 1014/2865 [00:06<00:12, 148.89it/s]
36%|███▌ | 1029/2865 [00:07<00:12, 148.80it/s]
36%|███▋ | 1044/2865 [00:07<00:12, 148.68it/s]
37%|███▋ | 1059/2865 [00:07<00:12, 148.61it/s]
37%|███▋ | 1074/2865 [00:07<00:12, 148.55it/s]
38%|███▊ | 1089/2865 [00:07<00:11, 148.51it/s]
39%|███▊ | 1104/2865 [00:07<00:11, 148.46it/s]
39%|███▉ | 1119/2865 [00:07<00:11, 148.35it/s]
40%|███▉ | 1134/2865 [00:07<00:11, 148.17it/s]
40%|████ | 1149/2865 [00:07<00:11, 148.04it/s]
41%|████ | 1164/2865 [00:07<00:11, 147.95it/s]
41%|████ | 1179/2865 [00:08<00:11, 147.87it/s]
42%|████▏ | 1194/2865 [00:08<00:11, 147.76it/s]
42%|████▏ | 1209/2865 [00:08<00:11, 147.69it/s]
43%|████▎ | 1224/2865 [00:08<00:11, 147.64it/s]
43%|████▎ | 1239/2865 [00:08<00:11, 147.59it/s]
44%|████▍ | 1254/2865 [00:08<00:10, 147.53it/s]
44%|████▍ | 1269/2865 [00:08<00:10, 147.23it/s]
45%|████▍ | 1284/2865 [00:08<00:10, 147.07it/s]
45%|████▌ | 1299/2865 [00:08<00:10, 147.03it/s]
46%|████▌ | 1311/2865 [00:09<00:10, 145.63it/s]
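The log is tqdm-style progress output, where each update has the form `step/total [elapsed<remaining, rate]`. The following is a minimal sketch of how such updates could be extracted for inspection; the names `PROGRESS` and `parse_progress` are hypothetical, and the sample string is copied from two updates in the log on this page.

```python
import re
from typing import List, Tuple

# Matches one update such as "17/2865 [00:00<00:53, 53.23it/s]".
# The very first update reports "?it/s" and is intentionally skipped.
PROGRESS = re.compile(r"(\d+)/(\d+) \[\d+:\d+<[^,]*,\s*([\d.]+)it/s\]")


def parse_progress(log: str) -> List[Tuple[int, int, float]]:
    """Return (step, total, iterations_per_second) for each parsable update."""
    return [(int(s), int(t), float(r)) for s, t, r in PROGRESS.findall(log)]


sample = (
    "1%| | 17/2865 [00:00<00:53, 53.23it/s] "
    "46%|████▌ | 1311/2865 [00:09<00:10, 145.63it/s]"
)
step, total, rate = parse_progress(sample)[-1]
print(f"last update: step {step} of {total} at {rate} it/s")
```

Applied to the full log, this makes it easy to see that the rate settles at around 150 it/s and that the last recorded update is step 1311 of 2865.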
Version Details
- Version ID: c86319441e6805516974afc719860dbba372cf1e2466997dcb67e259bc47522e
- Version Created: February 11, 2025