lucataco/zeta-editing 🔢🖼️📝❓ → 🖼️

▶️ 1.8K runs 📅 Mar 2024 ⚙️ Cog 0.9.3 🔗 GitHub 📄 Paper ⚖️ License
audio-to-audio music-editing text-guided-audio-editing

About

Zero-Shot Text-Based Audio Editing Using DDPM Inversion

Example Output

Prompt:

"A recording of an arcade game soundtrack"


Performance Metrics

14.98s Prediction Time
15.02s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/KVudHMNiL8E0LcDPJfRr8WzgmY5Mry4d2uhYqMyU7xseJdWr/Beethoven.wav",
  "steps": 50,
  "prompt": "A recording of an arcade game soundtrack",
  "t_start": 45,
  "audio_version": "cvssp/audioldm2-music",
  "cfg_scale_src": 3,
  "cfg_scale_tar": 12,
  "source_prompt": ""
}
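
The parameters above can also be sent to the model from code. The following is a minimal sketch, assuming the third-party `replicate` Python client is installed and a `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_edit` are illustrative and do not come from this page; the version hash is the one listed under Version Details.

```python
# Model reference: owner/name plus the version ID from the Version Details section.
MODEL_REF = (
    "lucataco/zeta-editing:"
    "ff80c3cca3dc792e2cb119367dc11295b499cda961a2d0b34ecf5109b59c6528"
)


def build_input(audio_url, prompt, source_prompt="", steps=50, t_start=45,
                cfg_scale_src=3, cfg_scale_tar=12,
                audio_version="cvssp/audioldm2-music", seed=None):
    """Assemble the input dict, using the defaults listed on the model page."""
    payload = {
        "audio": audio_url,
        "steps": steps,
        "prompt": prompt,
        "t_start": t_start,
        "audio_version": audio_version,
        "cfg_scale_src": cfg_scale_src,
        "cfg_scale_tar": cfg_scale_tar,
        "source_prompt": source_prompt,
    }
    if seed is not None:
        # seed is optional; it is only sent when explicitly requested.
        payload["seed"] = seed
    return payload


def run_edit(payload):
    """Send the request. Requires network access and an API token."""
    import replicate  # imported lazily; assumed to be installed
    return replicate.run(MODEL_REF, input=payload)


# Example (not executed here because it needs a token and network access):
# payload = build_input(
#     "https://replicate.delivery/pbxt/KVudHMNiL8E0LcDPJfRr8WzgmY5Mry4d2uhYqMyU7xseJdWr/Beethoven.wav",
#     "A recording of an arcade game soundtrack",
# )
# print(run_edit(payload))
```

Keeping the payload construction separate from the network call makes it possible to inspect or log the request before spending a prediction on it.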
Input Parameters
seed Type: integer
Random seed
audio (required) Type: string
Input audio file
steps Type: integer, Default: 50
Number of diffusion steps; higher values (e.g. 200) yield higher-quality generations
prompt Type: string, Default: A recording of an arcade game soundtrack
Description of the desired edited output
t_start Type: integer, Default: 45, Range: 15 - 85
Edit strength as a percentage: lower values stay closer to the original audio, higher values produce a stronger edit
audio_version Default: cvssp/audioldm2-music
Model version to use (reported in the logs as "Using model")
cfg_scale_src Type: number, Default: 3
Source guidance scale
cfg_scale_tar Type: number, Default: 12
Target guidance scale
source_prompt Type: string, Default: (empty)
Optional description of the original audio input
Output Schema

Output

Type: string, Format: uri
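
Because the output is a URI rather than raw audio, the result has to be downloaded separately. A minimal sketch using only the Python standard library follows; `filename_from_uri` and `save_output` are hypothetical helpers, and the `output.wav` fallback name is an assumption, since the page does not state the output file format.

```python
import os
import posixpath
import urllib.parse
import urllib.request


def filename_from_uri(uri, default="output.wav"):
    """Take the last path segment of the URI, or fall back to a default name."""
    path = urllib.parse.urlparse(uri).path
    name = posixpath.basename(path)
    return name or default


def save_output(uri, directory="."):
    """Download the file behind the URI into `directory` and return its path."""
    target = os.path.join(directory, filename_from_uri(uri))
    urllib.request.urlretrieve(uri, target)  # requires network access
    return target
```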

Example Execution Logs
Using seed: 1060458061
Using model: cvssp/audioldm2-music
  0%|          | 0/50 [00:00<?, ?it/s]
  4%|▍         | 2/50 [00:00<00:03, 12.68it/s]
  8%|▊         | 4/50 [00:00<00:03, 12.36it/s]
 12%|█▏        | 6/50 [00:00<00:03, 12.40it/s]
 16%|█▌        | 8/50 [00:00<00:03, 12.40it/s]
 20%|██        | 10/50 [00:00<00:03, 12.58it/s]
 24%|██▍       | 12/50 [00:00<00:03, 12.62it/s]
 28%|██▊       | 14/50 [00:01<00:02, 12.67it/s]
 32%|███▏      | 16/50 [00:01<00:02, 12.73it/s]
 36%|███▌      | 18/50 [00:01<00:02, 12.68it/s]
 40%|████      | 20/50 [00:01<00:02, 12.67it/s]
 44%|████▍     | 22/50 [00:01<00:02, 12.57it/s]
 48%|████▊     | 24/50 [00:01<00:02, 12.56it/s]
 52%|█████▏    | 26/50 [00:02<00:01, 12.64it/s]
 56%|█████▌    | 28/50 [00:02<00:01, 12.72it/s]
 60%|██████    | 30/50 [00:02<00:01, 12.68it/s]
 64%|██████▍   | 32/50 [00:02<00:01, 12.61it/s]
 68%|██████▊   | 34/50 [00:02<00:01, 12.50it/s]
 72%|███████▏  | 36/50 [00:02<00:01, 12.56it/s]
 76%|███████▌  | 38/50 [00:03<00:00, 12.62it/s]
 80%|████████  | 40/50 [00:03<00:00, 12.62it/s]
 84%|████████▍ | 42/50 [00:03<00:00, 12.58it/s]
 88%|████████▊ | 44/50 [00:03<00:00, 12.60it/s]
 92%|█████████▏| 46/50 [00:03<00:00, 12.62it/s]
 96%|█████████▌| 48/50 [00:03<00:00, 12.63it/s]
100%|██████████| 50/50 [00:03<00:00, 12.67it/s]
100%|██████████| 50/50 [00:03<00:00, 12.61it/s]
  0%|          | 0/22 [00:00<?, ?it/s]
  5%|▍         | 1/22 [00:00<00:03,  6.44it/s]
  9%|▉         | 2/22 [00:00<00:03,  6.39it/s]
 14%|█▎        | 3/22 [00:00<00:02,  6.39it/s]
 18%|█▊        | 4/22 [00:00<00:02,  6.44it/s]
 23%|██▎       | 5/22 [00:00<00:02,  6.44it/s]
 27%|██▋       | 6/22 [00:00<00:02,  6.47it/s]
 32%|███▏      | 7/22 [00:01<00:02,  6.46it/s]
 36%|███▋      | 8/22 [00:01<00:02,  6.47it/s]
 41%|████      | 9/22 [00:01<00:02,  6.44it/s]
 45%|████▌     | 10/22 [00:01<00:01,  6.43it/s]
 50%|█████     | 11/22 [00:01<00:01,  6.40it/s]
 55%|█████▍    | 12/22 [00:01<00:01,  6.42it/s]
 59%|█████▉    | 13/22 [00:02<00:01,  6.44it/s]
 64%|██████▎   | 14/22 [00:02<00:01,  6.46it/s]
 68%|██████▊   | 15/22 [00:02<00:01,  6.47it/s]
 73%|███████▎  | 16/22 [00:02<00:00,  6.44it/s]
 77%|███████▋  | 17/22 [00:02<00:00,  6.43it/s]
 82%|████████▏ | 18/22 [00:02<00:00,  6.41it/s]
 86%|████████▋ | 19/22 [00:02<00:00,  6.44it/s]
 91%|█████████ | 20/22 [00:03<00:00,  6.45it/s]
 95%|█████████▌| 21/22 [00:03<00:00,  6.44it/s]
100%|██████████| 22/22 [00:03<00:00,  6.46it/s]
100%|██████████| 22/22 [00:03<00:00,  6.44it/s]
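
The two progress bars above run for 50 and 22 iterations. One reading, which is an assumption and not something the page confirms, is that the first pass covers all `steps` and the second covers only `t_start` percent of them, rounded down. The sketch below checks that arithmetic against the logged values; both function names are illustrative.

```python
def edit_step_count(steps, t_start):
    """Steps in the second pass, assuming it spans t_start percent of `steps`
    and that the result is rounded down (both are assumptions)."""
    return steps * t_start // 100


def approx_seconds(iterations, its_per_second):
    """Rough duration of a pass from its iteration count and logged rate."""
    return iterations / its_per_second


# With the example inputs (steps=50, t_start=45) this matches the 22 in the log.
print(edit_step_count(50, 45))
# Rates taken from the final log lines of each pass.
print(round(approx_seconds(50, 12.61), 2), round(approx_seconds(22, 6.44), 2))
```

Under this reading the two passes account for roughly 7.4 of the 14.98 seconds of prediction time. The page does not break down where the remainder goes.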
Version Details
Version ID
ff80c3cca3dc792e2cb119367dc11295b499cda961a2d0b34ecf5109b59c6528
Version Created
March 4, 2024