smaerdlatigid/stable-audio

▶️ 39 runs 📅 Nov 2024 ⚙️ Cog 0.11.3
music-generation sound-effect-generation text-to-audio

About

Create audio clips from text

Example Output

Prompt:

"A gentle rainfall with distant thunder"

Performance Metrics

17.76s Prediction Time
133.58s Total Time
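
The two timings above differ by almost two minutes. A minimal sketch of that arithmetic follows; reading the gap as time spent outside the prediction itself (queueing, startup, and similar) is an assumption, since the page does not break the total down.

```python
# Timings copied from the Performance Metrics above.
prediction_time_s = 17.76
total_time_s = 133.58

# Time not spent in the prediction itself. What fills this gap
# (queueing, container startup, ...) is an assumption, not stated on the page.
overhead_s = round(total_time_s - prediction_time_s, 2)

# Fraction of the total wall-clock time spent generating audio.
prediction_share = round(prediction_time_s / total_time_s, 3)

print(f"overhead: {overhead_s} s, prediction share: {prediction_share:.1%}")
```

On these numbers the prediction accounts for roughly 13% of the total time.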
All Input Parameters
{
  "cfg": 7,
  "steps": 120,
  "prompt": "A gentle rainfall with distant thunder",
  "seconds_total": 60
}
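
As a hedged sketch, the same inputs could be sent from Python with the third-party `replicate` client, assuming that package is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `generate` are hypothetical; the model name and version ID are the ones shown on this page.

```python
MODEL = "smaerdlatigid/stable-audio"
VERSION = "80d7a3ff48781aadfe37bd4c0c0317ffa94c67698d661f4792b1b01129a29689"
MODEL_REF = f"{MODEL}:{VERSION}"


def build_input(prompt, seconds_total=60, steps=120, cfg=7):
    """Build the input payload, using the defaults listed on this page."""
    return {
        "cfg": cfg,
        "steps": steps,
        "prompt": prompt,
        "seconds_total": seconds_total,
    }


def generate(prompt, **overrides):
    """Run the model on Replicate (hypothetical helper; needs network access)."""
    # Imported lazily so the payload helper works without the package installed.
    import replicate

    return replicate.run(MODEL_REF, input=build_input(prompt, **overrides))


# Example call (not executed here; requires an API token):
# output = generate("A gentle rainfall with distant thunder")
```

Keeping payload construction separate from the network call makes it easy to inspect or log the exact inputs before spending a prediction.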
Input Parameters
cfg Type: number, Default: 7
Classifier-free guidance (CFG) scale for the model
steps Type: integer, Default: 120
Number of sampling steps for the model
prompt Type: string, Default: A gentle rainfall with distant thunder
Describe the audio to generate
seconds_total Type: integer, Default: 60
Total duration in seconds
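
The execution logs on this page report a sample rate of 44100 and a sample size of 2646000 for a 60-second request, which is simply duration times sample rate. A minimal sketch of that relationship follows; the function names are hypothetical, and the conditioning dictionary only mirrors the shape printed in the logs rather than the model's actual source code.

```python
SAMPLE_RATE_HZ = 44100  # from the log line "Sample rate: 44100"


def sample_size(seconds_total, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of audio samples per channel for the requested duration."""
    return seconds_total * sample_rate_hz


def build_conditioning(prompt, seconds_total):
    """Conditioning list in the shape shown in the logs (illustrative only)."""
    return [{"prompt": prompt, "seconds_start": 0, "seconds_total": seconds_total}]


print(sample_size(60))  # 2646000, matching "Sample size: 2646000" in the logs
print(build_conditioning("A gentle rainfall with distant thunder", 60))
```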
Output Schema

Output

Type: array, Items Type: string, Items Format: uri
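
A small sketch of checking a response against this schema, assuming the output arrives as a plain list of strings (some client libraries may wrap items in file objects instead). The function name `is_valid_output` and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def is_valid_output(output):
    """Return True if output is a list of strings that each carry a URI scheme."""
    if not isinstance(output, list):
        return False
    for item in output:
        if not isinstance(item, str):
            return False
        # A URI needs at least a scheme such as "https".
        if not urlparse(item).scheme:
            return False
    return True


# Hypothetical example value; real outputs point at generated audio files.
example = ["https://example.com/output.wav"]
print(is_valid_output(example))
```

This checks only the shape of the response; it does not confirm that the URIs are reachable or that they point to audio.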

Example Execution Logs
Prompt received: A gentle rainfall with distant thunder
Settings: Duration=60s, Steps=120, CFG Scale=7.0
Sample rate: 44100, Sample size: 2646000
Conditioning: [{'prompt': 'A gentle rainfall with distant thunder', 'seconds_start': 0, 'seconds_total': 60}]
Generating audio...
528137451
/src/stable_audio_tools/models/conditioners.py:314: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(dtype=torch.float16) and torch.set_grad_enabled(self.enable_grad):
/src/stable_audio_tools/inference/sampling.py:176: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast():
  0%|          | 0/120 [00:00<?, ?it/s]
/root/.pyenv/versions/3.9.20/lib/python3.9/contextlib.py:87: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
/root/.pyenv/versions/3.9.20/lib/python3.9/site-packages/torchsde/_brownian/brownian_interval.py:608: UserWarning: Should have tb<=t1 but got tb=500.00006103515625 and t1=500.000061.
warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
  1%|          | 1/120 [00:00<00:24,  4.81it/s]
  2%|▏         | 2/120 [00:00<00:17,  6.91it/s]
  2%|▎         | 3/120 [00:00<00:14,  8.00it/s]
  3%|▎         | 4/120 [00:00<00:13,  8.67it/s]
  4%|▍         | 5/120 [00:00<00:12,  9.06it/s]
  5%|▌         | 6/120 [00:00<00:12,  9.32it/s]
  6%|▌         | 7/120 [00:00<00:11,  9.51it/s]
  7%|▋         | 8/120 [00:00<00:11,  9.64it/s]
  8%|▊         | 9/120 [00:01<00:11,  9.73it/s]
  8%|▊         | 10/120 [00:01<00:11,  9.79it/s]
  9%|▉         | 11/120 [00:01<00:11,  9.83it/s]
 10%|█         | 12/120 [00:01<00:10,  9.84it/s]
 11%|█         | 13/120 [00:01<00:10,  9.87it/s]
 12%|█▏        | 14/120 [00:01<00:10,  9.86it/s]
 12%|█▎        | 15/120 [00:01<00:10,  9.87it/s]
 13%|█▎        | 16/120 [00:01<00:10,  9.88it/s]
 14%|█▍        | 17/120 [00:01<00:10,  9.88it/s]
 15%|█▌        | 18/120 [00:01<00:10,  9.88it/s]
 16%|█▌        | 19/120 [00:02<00:10,  9.86it/s]
 17%|█▋        | 20/120 [00:02<00:10,  9.89it/s]
 18%|█▊        | 21/120 [00:02<00:10,  9.88it/s]
 18%|█▊        | 22/120 [00:02<00:09,  9.88it/s]
 19%|█▉        | 23/120 [00:02<00:09,  9.88it/s]
 20%|██        | 24/120 [00:02<00:09,  9.88it/s]
 21%|██        | 25/120 [00:02<00:09,  9.88it/s]
 22%|██▏       | 26/120 [00:02<00:09,  9.87it/s]
 22%|██▎       | 27/120 [00:02<00:09,  9.87it/s]
 23%|██▎       | 28/120 [00:02<00:09,  9.88it/s]
 24%|██▍       | 29/120 [00:03<00:09,  9.88it/s]
 25%|██▌       | 30/120 [00:03<00:09,  9.87it/s]
 26%|██▌       | 31/120 [00:03<00:09,  9.87it/s]
 27%|██▋       | 32/120 [00:03<00:08,  9.85it/s]
 28%|██▊       | 33/120 [00:03<00:08,  9.87it/s]
 28%|██▊       | 34/120 [00:03<00:08,  9.85it/s]
 29%|██▉       | 35/120 [00:03<00:08,  9.85it/s]
 30%|███       | 36/120 [00:03<00:08,  9.85it/s]
 31%|███       | 37/120 [00:03<00:08,  9.85it/s]
 32%|███▏      | 38/120 [00:03<00:08,  9.85it/s]
 32%|███▎      | 39/120 [00:04<00:08,  9.88it/s]
 33%|███▎      | 40/120 [00:04<00:08,  9.88it/s]
 34%|███▍      | 41/120 [00:04<00:08,  9.87it/s]
 35%|███▌      | 42/120 [00:04<00:07,  9.88it/s]
 36%|███▌      | 43/120 [00:04<00:07,  9.89it/s]
 37%|███▋      | 44/120 [00:04<00:07,  9.88it/s]
 38%|███▊      | 45/120 [00:04<00:07,  9.89it/s]
 38%|███▊      | 46/120 [00:04<00:07,  9.81it/s]
 39%|███▉      | 47/120 [00:04<00:07,  9.72it/s]
 40%|████      | 48/120 [00:04<00:07,  9.73it/s]
 41%|████      | 49/120 [00:05<00:07,  9.74it/s]
 42%|████▏     | 50/120 [00:05<00:07,  9.78it/s]
 42%|████▎     | 51/120 [00:05<00:07,  9.81it/s]
 43%|████▎     | 52/120 [00:05<00:06,  9.80it/s]
 44%|████▍     | 53/120 [00:05<00:06,  9.79it/s]
 45%|████▌     | 54/120 [00:05<00:06,  9.80it/s]
 46%|████▌     | 55/120 [00:05<00:06,  9.80it/s]
 47%|████▋     | 56/120 [00:05<00:06,  9.79it/s]
 48%|████▊     | 57/120 [00:05<00:06,  9.77it/s]
 48%|████▊     | 58/120 [00:05<00:06,  9.81it/s]
 49%|████▉     | 59/120 [00:06<00:06,  9.83it/s]
 50%|█████     | 60/120 [00:06<00:06,  9.86it/s]
 51%|█████     | 61/120 [00:06<00:05,  9.87it/s]
 52%|█████▏    | 62/120 [00:06<00:05,  9.78it/s]
 52%|█████▎    | 63/120 [00:06<00:05,  9.77it/s]
 53%|█████▎    | 64/120 [00:06<00:05,  9.82it/s]
 54%|█████▍    | 65/120 [00:06<00:05,  9.84it/s]
 55%|█████▌    | 66/120 [00:06<00:05,  9.87it/s]
 56%|█████▌    | 67/120 [00:06<00:05,  9.88it/s]
 57%|█████▋    | 68/120 [00:07<00:05,  9.90it/s]
 57%|█████▊    | 69/120 [00:07<00:05,  9.92it/s]
 58%|█████▊    | 70/120 [00:07<00:05,  9.93it/s]
 59%|█████▉    | 71/120 [00:07<00:04,  9.92it/s]
 60%|██████    | 72/120 [00:07<00:04,  9.80it/s]
 61%|██████    | 73/120 [00:07<00:04,  9.86it/s]
 62%|██████▏   | 74/120 [00:07<00:04,  9.87it/s]
 62%|██████▎   | 75/120 [00:07<00:04,  9.90it/s]
 63%|██████▎   | 76/120 [00:07<00:04,  9.91it/s]
 64%|██████▍   | 77/120 [00:07<00:04,  9.90it/s]
 65%|██████▌   | 78/120 [00:08<00:04,  9.93it/s]
 66%|██████▌   | 79/120 [00:08<00:04,  9.92it/s]
 67%|██████▋   | 80/120 [00:08<00:04,  9.92it/s]
 68%|██████▊   | 81/120 [00:08<00:03,  9.91it/s]
 68%|██████▊   | 82/120 [00:08<00:03,  9.91it/s]
 69%|██████▉   | 83/120 [00:08<00:03,  9.85it/s]
 70%|███████   | 84/120 [00:08<00:03,  9.87it/s]
 71%|███████   | 85/120 [00:08<00:03,  9.89it/s]
 72%|███████▏  | 86/120 [00:08<00:03,  9.89it/s]
 72%|███████▎  | 87/120 [00:08<00:03,  9.90it/s]
 73%|███████▎  | 88/120 [00:09<00:03,  9.91it/s]
 74%|███████▍  | 89/120 [00:09<00:03,  9.92it/s]
 75%|███████▌  | 90/120 [00:09<00:03,  9.91it/s]
 76%|███████▌  | 91/120 [00:09<00:02,  9.92it/s]
 77%|███████▋  | 92/120 [00:09<00:02,  9.93it/s]
 78%|███████▊  | 93/120 [00:09<00:02,  9.93it/s]
 78%|███████▊  | 94/120 [00:09<00:02,  9.93it/s]
 79%|███████▉  | 95/120 [00:09<00:02,  9.94it/s]
 80%|████████  | 96/120 [00:09<00:02,  9.95it/s]
 81%|████████  | 97/120 [00:09<00:02,  9.96it/s]
 82%|████████▏ | 98/120 [00:10<00:02,  9.96it/s]
 83%|████████▎ | 100/120 [00:10<00:02,  9.98it/s]
 84%|████████▍ | 101/120 [00:10<00:01,  9.97it/s]
 85%|████████▌ | 102/120 [00:10<00:01,  9.97it/s]
 86%|████████▌ | 103/120 [00:10<00:01,  9.95it/s]
 87%|████████▋ | 104/120 [00:10<00:01,  9.92it/s]
 88%|████████▊ | 105/120 [00:10<00:01,  9.92it/s]
 88%|████████▊ | 106/120 [00:10<00:01,  9.93it/s]
 89%|████████▉ | 107/120 [00:10<00:01,  9.94it/s]
 91%|█████████ | 109/120 [00:11<00:01,  9.95it/s]
 92%|█████████▏| 110/120 [00:11<00:01,  9.95it/s]
 92%|█████████▎| 111/120 [00:11<00:00,  9.95it/s]
 93%|█████████▎| 112/120 [00:11<00:00,  9.94it/s]
 94%|█████████▍| 113/120 [00:11<00:00,  9.93it/s]
 96%|█████████▌| 115/120 [00:11<00:00,  9.97it/s]
 98%|█████████▊| 117/120 [00:11<00:00,  9.99it/s]
 98%|█████████▊| 118/120 [00:12<00:00,  9.99it/s]
100%|██████████| 120/120 [00:12<00:00, 10.09it/s]
100%|██████████| 120/120 [00:12<00:00,  9.81it/s]
Audio generated.
Audio rearranged.
Audio normalized and converted.
Saving audio to file: /tmp/outputs/output_9f0f5406daaf43d88fffc76dc0b62c02.wav
Audio saved: /tmp/outputs/output_9f0f5406daaf43d88fffc76dc0b62c02.wav
Failed to upload image: {'statusCode': 400, 'error': 'Duplicate', 'message': 'The resource already exists'}
Failed to upload metadata: {'statusCode': 400, 'error': 'Duplicate', 'message': 'The resource already exists'}
Version Details
Version ID
80d7a3ff48781aadfe37bd4c0c0317ffa94c67698d661f4792b1b01129a29689
Version Created
November 3, 2024