voodoohop/stable-diffusion-dance 🔢📝🖼️❓✓ → 🖼️

▶️ 21 runs 📅 Mar 2025 ⚙️ Cog 0.14.3
audio-reactive audio-to-video text-to-video

About

Example Output

Output


Performance Metrics

142.96s Prediction Time
301.10s Total Time
All Input Parameters
{
  "width": 512,
  "height": 512,
  "prompts": "A painting of a moth\nA painting of a killer dragonfly by paul klee, intricate detail\nTwo fishes talking to eachother in deep sea, art by hieronymus bosch",
  "audio_file": "https://replicate.delivery/pbxt/MjYAIYsg89ldJjFQtKgai1Tpl9urAqeCrx2PHJCemHO8r6iY/test1.mp3",
  "batch_size": 24,
  "frame_rate": 16,
  "random_seed": 13,
  "prompt_scale": 15,
  "style_suffix": "by paul klee, intricate details",
  "audio_smoothing": 0.8,
  "diffusion_steps": 20,
  "audio_noise_scale": 0.3,
  "audio_loudness_type": "peak",
  "frame_interpolation": true
}
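The parameters above could be submitted programmatically. The following is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are hypothetical; the model name and version ID are taken from this page.

```python
MODEL = "voodoohop/stable-diffusion-dance"
VERSION = "2ea07a2bdff3ff5f02c8a91b71fa627786b243174ca4a9ee10a0b00bd5771890"


def build_input(audio_url: str, prompts: list[str], **overrides) -> dict:
    """Assemble an input payload (hypothetical helper).

    Prompts are joined with newlines, matching the "prompts" value in the
    example above. Keyword overrides replace individual settings.
    """
    payload = {
        "width": 512,
        "height": 512,
        "prompts": "\n".join(prompts),
        "audio_file": audio_url,
        "batch_size": 24,
        "frame_rate": 16,
        "random_seed": 13,
        "prompt_scale": 15,
        "style_suffix": "by paul klee, intricate details",
        "audio_smoothing": 0.8,
        "diffusion_steps": 20,
        "audio_noise_scale": 0.3,
        "audio_loudness_type": "peak",
        "frame_interpolation": True,
    }
    payload.update(overrides)
    return payload


def run_prediction(payload: dict):
    """Send the payload to Replicate (needs network access and an API token)."""
    # Imported lazily so build_input works without the client installed.
    import replicate

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)


# Usage (not executed here):
# output = run_prediction(build_input("https://example.com/track.mp3",
#                                     ["A painting of a moth"]))
```

Keeping payload construction separate from the network call makes the settings easy to inspect or override before spending GPU time.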
Input Parameters
width Type: integer Default: 384
Width of the generated image. The model was essentially only trained on 512x512 images; other sizes tend to produce less coherent images.
height Type: integer Default: 512
Height of the generated image. The model was essentially only trained on 512x512 images; other sizes tend to produce less coherent images.
prompts Type: string Default: "A painting of a moth\nA painting of a killer dragonfly by paul klee, intricate detail\nTwo fishes talking to eachother in deep sea, art by hieronymus bosch"
audio_file Type: string
Input audio file.
batch_size Type: integer Default: 24
Number of images to generate at once. Larger batch sizes generate images faster but use more GPU memory, so they may fail at higher resolutions.
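The example execution log shows 96 frames rendered as four batches of 24 (frames 00000 to 00023, 00024 to 00047, and so on). A minimal sketch of that chunking follows; `batched_indices` is a hypothetical helper, not the model's own code.

```python
def batched_indices(num_frames: int, batch_size: int) -> list[range]:
    """Split frame indices 0..num_frames-1 into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        range(start, min(start + batch_size, num_frames))
        for start in range(0, num_frames, batch_size)
    ]


batches = batched_indices(96, 24)
print(len(batches))                   # 4
print(batches[1][0], batches[1][-1])  # 24 47
```

The last batch is simply shorter when the frame count is not a multiple of the batch size.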
frame_rate Type: number Default: 16
Frames per second for the generated video.
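The numbers in the example execution log (hop length 1378, audio length 132480, sample rate 22050, 97 audio intensities) are consistent with one loudness value per video frame. The sketch below reproduces that arithmetic. It assumes the hop length is the truncated ratio of sample rate to frame rate and that the frame count follows the centered-framing rule `1 + n // hop` used by librosa-style analysis; both are inferences from the log, not confirmed internals.

```python
def hop_length(sample_rate: int, frame_rate: float) -> int:
    """Audio samples per video frame, truncated to an integer."""
    return int(sample_rate / frame_rate)


def num_intensity_frames(num_samples: int, hop: int) -> int:
    """Frame count under centered framing (assumed, see lead-in)."""
    return 1 + num_samples // hop


hop = hop_length(22050, 16)
print(hop)                                # 1378
print(num_intensity_frames(132480, hop))  # 97
print(round(132480 / 22050, 2))           # 6.01 (seconds of audio)
```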
random_seed Type: integer Default: 13
Each seed generates a different image.
prompt_scale Type: number Default: 15
Determines influence of your prompt on generation.
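The description does not say how the scale is applied. In Stable Diffusion samplers a value like this is usually the classifier-free guidance scale, so the sketch below assumes that interpretation. `guided_noise` is a hypothetical name, and plain lists stand in for the noise-prediction tensors.

```python
def guided_noise(
    eps_uncond: list[float], eps_cond: list[float], scale: float
) -> list[float]:
    """Classifier-free guidance: move from the unconditional prediction
    toward (and past) the prompt-conditioned one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# scale = 1 returns the conditioned prediction unchanged;
# larger values exaggerate the difference the prompt makes.
print(guided_noise([0.0, 0.2], [1.0, 0.2], 1.0))   # [1.0, 0.2]
print(guided_noise([0.0, 0.2], [1.0, 0.2], 15.0))  # [15.0, 0.2]
```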
style_suffix Type: string Default: by paul klee, intricate details
Style suffix to add to the prompt. This can be used to add the same style to each prompt.
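The "translated prompts" line in the example execution log shows each prompt joined to the suffix with a period, for example 'A painting of a moth.by paul klee, intricate details'. A minimal sketch of that expansion follows; `apply_style_suffix` is a hypothetical helper. The log also lists the first prompt twice, and since the reason is not documented, this sketch does not reproduce it.

```python
def apply_style_suffix(prompts: str, style_suffix: str) -> list[str]:
    """Split newline-separated prompts and append the suffix to each,
    using the "prompt.suffix" form seen in the example log."""
    lines = [line.strip() for line in prompts.split("\n") if line.strip()]
    return [f"{line}.{style_suffix}" for line in lines]


expanded = apply_style_suffix(
    "A painting of a moth\nA painting of a killer dragonfly",
    "by paul klee, intricate details",
)
print(expanded[0])  # A painting of a moth.by paul klee, intricate details
```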
audio_smoothing Type: number Default: 0.8
Audio smoothing factor.
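The page does not define the smoothing factor further. One common reading is an exponential moving average over the per-frame loudness values, where a factor closer to 1 reacts more slowly to changes. The sketch below illustrates that assumption only; the model may smooth differently, and `smooth` is a hypothetical name.

```python
def smooth(values: list[float], smoothing: float) -> list[float]:
    """Exponential moving average: each output keeps `smoothing` of the
    previous output and takes the rest from the new value."""
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")
    smoothed: list[float] = []
    previous = None
    for value in values:
        if previous is None:
            previous = value  # the first value is passed through unchanged
        else:
            previous = smoothing * previous + (1.0 - smoothing) * value
        smoothed.append(previous)
    return smoothed


# A sudden jump from 0 to 1 is approached gradually at smoothing = 0.8.
print([round(v, 2) for v in smooth([0.0, 1.0, 1.0, 1.0], 0.8)])
```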
diffusion_steps Type: integer Default: 20
Number of diffusion steps. More steps can produce better results but take longer to generate. Maximum 30 (using K-Euler-Diffusion).
audio_noise_scale Type: number Default: 0.3
Larger values mean audio will lead to bigger changes in the image.
audio_loudness_type Default: peak
Type of loudness to use for audio. Options are 'rms' or 'peak'.
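The two options differ in what they measure: peak takes the largest absolute sample in each frame, while RMS takes the root mean square of the frame, which is less sensitive to isolated spikes. The following is a simplified sketch over non-overlapping frames using only the standard library; `frame_loudness` is a hypothetical helper and the model's own framing may differ.

```python
import math


def frame_loudness(samples: list[float], hop: int, kind: str = "peak") -> list[float]:
    """Compute one loudness value per non-overlapping frame of `hop` samples."""
    if hop < 1:
        raise ValueError("hop must be at least 1")
    values = []
    for start in range(0, len(samples), hop):
        frame = samples[start:start + hop]
        if kind == "peak":
            values.append(max(abs(x) for x in frame))
        elif kind == "rms":
            values.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
        else:
            raise ValueError("kind must be 'rms' or 'peak'")
    return values


# A single spike dominates the peak measure but is averaged down by RMS.
spike = [0.0, 0.0, 0.0, 1.0]
print(frame_loudness(spike, 4, "peak"))  # [1.0]
print(frame_loudness(spike, 4, "rms"))   # [0.5]
```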
frame_interpolation Type: boolean Default: true
Whether to interpolate between frames using FFmpeg.
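The example execution log contains an FFmpeg warning that multiple `-filter`/`-vf` options were given and only the last one (`format=yuv420p`) was applied, and the second-pass output is still reported at 16 fps. The log does not show which filter was dropped. As a hedged sketch of how motion interpolation can be requested without that conflict, the command below joins FFmpeg's `minterpolate` filter and the pixel-format conversion into a single `-vf` chain. The 60 fps target is an assumption suggested by the `_60fps` output file name, and `interpolation_command` is a hypothetical helper.

```python
def interpolation_command(src: str, dst: str, target_fps: int = 60) -> list[str]:
    """Build an ffmpeg command with one combined video filter chain."""
    filters = ",".join([
        f"minterpolate=fps={target_fps}:mi_mode=mci",  # motion-compensated interpolation
        "format=yuv420p",                               # widely compatible pixel format
    ])
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-vf", filters,
        "-c:v", "libx264", "-crf", "20",
        "-c:a", "aac",
        dst,
    ]


cmd = interpolation_command("in.mp4", "out_60fps.mp4")
print(" ".join(cmd))
# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Passing the filters as one comma-separated chain means FFmpeg applies both in order instead of keeping only the last option.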
Output Schema

Output

Type: array Items Type: string Items Format: uri

Example Execution Logs
translated prompts ['A painting of a moth.by paul klee, intricate details', 'A painting of a moth.by paul klee, intricate details', 'A painting of a killer dragonfly by paul klee, intricate detail.by paul klee, intricate details', 'Two fishes talking to eachother in deep sea, art by hieronymus bosch.by paul klee, intricate details']
using audio file /tmp/tmph1pdxav8test1.mp3
hop length 1378 audio length 132480 audio sr 22050
length of audio intensities 97
num frames per prompt 32
Global seed set to 13
embedding prompts
prompt 0 shape torch.Size([1, 77, 768])
len smoothed_audio_intensities 97 len interpolated_prompts 96
interp prompts 0 shape torch.Size([1, 77, 768]) start_codes 0 shape torch.Size([1, 4, 64, 64])
data_batch torch.Size([24, 77, 768]) start_code_batchtorch.Size([24, 4, 64, 64])
sigmas tensor([14.6090, 10.7477,  8.0787,  6.2059,  4.8573,  3.8654,  3.1229,  2.5581,
2.1152,  1.7641,  1.4802,  1.2456,  1.0481,  0.8783,  0.7297,  0.5962,
0.4736,  0.3552,  0.2322,  0.0313,  0.0000], device='cuda:0')
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:01<00:27,  1.43s/it]
 10%|█         | 2/20 [00:02<00:25,  1.40s/it]
 15%|█▌        | 3/20 [00:04<00:24,  1.41s/it]
 20%|██        | 4/20 [00:05<00:22,  1.42s/it]
 25%|██▌       | 5/20 [00:07<00:21,  1.42s/it]
 30%|███       | 6/20 [00:08<00:19,  1.43s/it]
 35%|███▌      | 7/20 [00:09<00:18,  1.43s/it]
 40%|████      | 8/20 [00:11<00:17,  1.43s/it]
 45%|████▌     | 9/20 [00:12<00:15,  1.43s/it]
 50%|█████     | 10/20 [00:14<00:14,  1.43s/it]
 55%|█████▌    | 11/20 [00:15<00:12,  1.43s/it]
 60%|██████    | 12/20 [00:17<00:11,  1.43s/it]
 65%|██████▌   | 13/20 [00:18<00:10,  1.43s/it]
 70%|███████   | 14/20 [00:19<00:08,  1.43s/it]
 75%|███████▌  | 15/20 [00:21<00:07,  1.43s/it]
 80%|████████  | 16/20 [00:22<00:05,  1.43s/it]
 85%|████████▌ | 17/20 [00:24<00:04,  1.44s/it]
 90%|█████████ | 18/20 [00:25<00:02,  1.44s/it]
 95%|█████████▌| 19/20 [00:27<00:01,  1.44s/it]
100%|██████████| 20/20 [00:28<00:00,  1.44s/it]
100%|██████████| 20/20 [00:28<00:00,  1.43s/it]
samples_ddim torch.Size([24, 4, 64, 64])
Saved ./outputs/00000.png
Saved ./outputs/00001.png
Saved ./outputs/00002.png
Saved ./outputs/00003.png
Saved ./outputs/00004.png
Saved ./outputs/00005.png
Saved ./outputs/00006.png
Saved ./outputs/00007.png
Saved ./outputs/00008.png
Saved ./outputs/00009.png
Saved ./outputs/00010.png
Saved ./outputs/00011.png
Saved ./outputs/00012.png
Saved ./outputs/00013.png
Saved ./outputs/00014.png
Saved ./outputs/00015.png
Saved ./outputs/00016.png
Saved ./outputs/00017.png
Saved ./outputs/00018.png
Saved ./outputs/00019.png
Saved ./outputs/00020.png
Saved ./outputs/00021.png
Saved ./outputs/00022.png
Saved ./outputs/00023.png
data_batch torch.Size([24, 77, 768]) start_code_batch torch.Size([24, 4, 64, 64])
sigmas tensor([14.6090, 10.7477,  8.0787,  6.2059,  4.8573,  3.8654,  3.1229,  2.5581,
2.1152,  1.7641,  1.4802,  1.2456,  1.0481,  0.8783,  0.7297,  0.5962,
0.4736,  0.3552,  0.2322,  0.0313,  0.0000], device='cuda:0')
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:00<00:09,  1.99it/s]
 10%|█         | 2/20 [00:01<00:18,  1.05s/it]
 15%|█▌        | 3/20 [00:03<00:20,  1.23s/it]
 20%|██        | 4/20 [00:04<00:20,  1.31s/it]
 25%|██▌       | 5/20 [00:06<00:20,  1.36s/it]
 30%|███       | 6/20 [00:07<00:19,  1.38s/it]
 35%|███▌      | 7/20 [00:09<00:18,  1.40s/it]
 40%|████      | 8/20 [00:10<00:16,  1.41s/it]
 45%|████▌     | 9/20 [00:11<00:15,  1.42s/it]
 50%|█████     | 10/20 [00:13<00:14,  1.43s/it]
 55%|█████▌    | 11/20 [00:14<00:12,  1.43s/it]
 60%|██████    | 12/20 [00:16<00:11,  1.43s/it]
 65%|██████▌   | 13/20 [00:17<00:10,  1.44s/it]
 70%|███████   | 14/20 [00:19<00:08,  1.44s/it]
 75%|███████▌  | 15/20 [00:20<00:07,  1.44s/it]
 80%|████████  | 16/20 [00:22<00:05,  1.44s/it]
 85%|████████▌ | 17/20 [00:23<00:04,  1.44s/it]
 90%|█████████ | 18/20 [00:24<00:02,  1.44s/it]
 95%|█████████▌| 19/20 [00:26<00:01,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.39s/it]
samples_ddim torch.Size([24, 4, 64, 64])
Saved ./outputs/00024.png
Saved ./outputs/00025.png
Saved ./outputs/00026.png
Saved ./outputs/00027.png
Saved ./outputs/00028.png
Saved ./outputs/00029.png
Saved ./outputs/00030.png
Saved ./outputs/00031.png
Saved ./outputs/00032.png
Saved ./outputs/00033.png
Saved ./outputs/00034.png
Saved ./outputs/00035.png
Saved ./outputs/00036.png
Saved ./outputs/00037.png
Saved ./outputs/00038.png
Saved ./outputs/00039.png
Saved ./outputs/00040.png
Saved ./outputs/00041.png
Saved ./outputs/00042.png
Saved ./outputs/00043.png
Saved ./outputs/00044.png
Saved ./outputs/00045.png
Saved ./outputs/00046.png
Saved ./outputs/00047.png
data_batch torch.Size([24, 77, 768]) start_code_batch torch.Size([24, 4, 64, 64])
sigmas tensor([14.6090, 10.7477,  8.0787,  6.2059,  4.8573,  3.8654,  3.1229,  2.5581,
2.1152,  1.7641,  1.4802,  1.2456,  1.0481,  0.8783,  0.7297,  0.5962,
0.4736,  0.3552,  0.2322,  0.0313,  0.0000], device='cuda:0')
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:00<00:09,  1.99it/s]
 10%|█         | 2/20 [00:01<00:19,  1.06s/it]
 15%|█▌        | 3/20 [00:03<00:20,  1.23s/it]
 20%|██        | 4/20 [00:04<00:21,  1.32s/it]
 25%|██▌       | 5/20 [00:06<00:20,  1.36s/it]
 30%|███       | 6/20 [00:07<00:19,  1.39s/it]
 35%|███▌      | 7/20 [00:09<00:18,  1.41s/it]
 40%|████      | 8/20 [00:10<00:17,  1.42s/it]
 45%|████▌     | 9/20 [00:12<00:15,  1.42s/it]
 50%|█████     | 10/20 [00:13<00:14,  1.43s/it]
 55%|█████▌    | 11/20 [00:14<00:12,  1.43s/it]
 60%|██████    | 12/20 [00:16<00:11,  1.43s/it]
 65%|██████▌   | 13/20 [00:17<00:10,  1.44s/it]
 70%|███████   | 14/20 [00:19<00:08,  1.44s/it]
 75%|███████▌  | 15/20 [00:20<00:07,  1.44s/it]
 80%|████████  | 16/20 [00:22<00:05,  1.44s/it]
 85%|████████▌ | 17/20 [00:23<00:04,  1.44s/it]
 90%|█████████ | 18/20 [00:25<00:02,  1.44s/it]
 95%|█████████▌| 19/20 [00:26<00:01,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.39s/it]
samples_ddim torch.Size([24, 4, 64, 64])
Saved ./outputs/00048.png
Saved ./outputs/00049.png
Saved ./outputs/00050.png
Saved ./outputs/00051.png
Saved ./outputs/00052.png
Saved ./outputs/00053.png
Saved ./outputs/00054.png
Saved ./outputs/00055.png
Saved ./outputs/00056.png
Saved ./outputs/00057.png
Saved ./outputs/00058.png
Saved ./outputs/00059.png
Saved ./outputs/00060.png
Saved ./outputs/00061.png
Saved ./outputs/00062.png
Saved ./outputs/00063.png
Saved ./outputs/00064.png
Saved ./outputs/00065.png
Saved ./outputs/00066.png
Saved ./outputs/00067.png
Saved ./outputs/00068.png
Saved ./outputs/00069.png
Saved ./outputs/00070.png
Saved ./outputs/00071.png
data_batch torch.Size([24, 77, 768]) start_code_batch torch.Size([24, 4, 64, 64])
sigmas tensor([14.6090, 10.7477,  8.0787,  6.2059,  4.8573,  3.8654,  3.1229,  2.5581,
2.1152,  1.7641,  1.4802,  1.2456,  1.0481,  0.8783,  0.7297,  0.5962,
0.4736,  0.3552,  0.2322,  0.0313,  0.0000], device='cuda:0')
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:00<00:09,  1.98it/s]
 10%|█         | 2/20 [00:01<00:19,  1.06s/it]
 15%|█▌        | 3/20 [00:03<00:20,  1.23s/it]
 20%|██        | 4/20 [00:04<00:21,  1.36s/it]
 25%|██▌       | 5/20 [00:06<00:20,  1.37s/it]
 30%|███       | 6/20 [00:07<00:19,  1.40s/it]
 35%|███▌      | 7/20 [00:09<00:18,  1.41s/it]
 40%|████      | 8/20 [00:10<00:17,  1.42s/it]
 45%|████▌     | 9/20 [00:12<00:15,  1.43s/it]
 50%|█████     | 10/20 [00:13<00:14,  1.43s/it]
 55%|█████▌    | 11/20 [00:15<00:12,  1.44s/it]
 60%|██████    | 12/20 [00:16<00:11,  1.44s/it]
 65%|██████▌   | 13/20 [00:17<00:10,  1.44s/it]
 70%|███████   | 14/20 [00:19<00:08,  1.44s/it]
 75%|███████▌  | 15/20 [00:20<00:07,  1.44s/it]
 80%|████████  | 16/20 [00:22<00:05,  1.44s/it]
 85%|████████▌ | 17/20 [00:23<00:04,  1.44s/it]
 90%|█████████ | 18/20 [00:25<00:02,  1.44s/it]
 95%|█████████▌| 19/20 [00:26<00:01,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.44s/it]
100%|██████████| 20/20 [00:27<00:00,  1.40s/it]
samples_ddim torch.Size([24, 4, 64, 64])
Saved ./outputs/00072.png
Saved ./outputs/00073.png
Saved ./outputs/00074.png
Saved ./outputs/00075.png
Saved ./outputs/00076.png
Saved ./outputs/00077.png
Saved ./outputs/00078.png
Saved ./outputs/00079.png
Saved ./outputs/00080.png
Saved ./outputs/00081.png
Saved ./outputs/00082.png
Saved ./outputs/00083.png
Saved ./outputs/00084.png
Saved ./outputs/00085.png
Saved ./outputs/00086.png
Saved ./outputs/00087.png
Saved ./outputs/00088.png
Saved ./outputs/00089.png
Saved ./outputs/00090.png
Saved ./outputs/00091.png
Saved ./outputs/00092.png
Saved ./outputs/00093.png
Saved ./outputs/00094.png
Saved ./outputs/00095.png
Your samples have been saved to:
./outputs
Enjoy.
total 61672
-rw-r--r-- 1 root root 642004 Mar 27 18:15 00000.png
-rw-r--r-- 1 root root 638277 Mar 27 18:15 00001.png
-rw-r--r-- 1 root root 636290 Mar 27 18:15 00002.png
-rw-r--r-- 1 root root 646528 Mar 27 18:15 00003.png
-rw-r--r-- 1 root root 642303 Mar 27 18:15 00004.png
-rw-r--r-- 1 root root 646229 Mar 27 18:15 00005.png
-rw-r--r-- 1 root root 646045 Mar 27 18:15 00006.png
-rw-r--r-- 1 root root 640138 Mar 27 18:15 00007.png
-rw-r--r-- 1 root root 640645 Mar 27 18:15 00008.png
-rw-r--r-- 1 root root 644376 Mar 27 18:15 00009.png
-rw-r--r-- 1 root root 651647 Mar 27 18:15 00010.png
-rw-r--r-- 1 root root 645823 Mar 27 18:15 00011.png
-rw-r--r-- 1 root root 636216 Mar 27 18:15 00012.png
-rw-r--r-- 1 root root 638321 Mar 27 18:15 00013.png
-rw-r--r-- 1 root root 640819 Mar 27 18:15 00014.png
-rw-r--r-- 1 root root 642209 Mar 27 18:15 00015.png
-rw-r--r-- 1 root root 635924 Mar 27 18:15 00016.png
-rw-r--r-- 1 root root 638007 Mar 27 18:15 00017.png
-rw-r--r-- 1 root root 639980 Mar 27 18:15 00018.png
-rw-r--r-- 1 root root 651785 Mar 27 18:15 00019.png
-rw-r--r-- 1 root root 649632 Mar 27 18:15 00020.png
-rw-r--r-- 1 root root 644949 Mar 27 18:15 00021.png
-rw-r--r-- 1 root root 636635 Mar 27 18:15 00022.png
-rw-r--r-- 1 root root 637092 Mar 27 18:15 00023.png
-rw-r--r-- 1 root root 640738 Mar 27 18:16 00024.png
-rw-r--r-- 1 root root 645466 Mar 27 18:16 00025.png
-rw-r--r-- 1 root root 641979 Mar 27 18:16 00026.png
-rw-r--r-- 1 root root 643876 Mar 27 18:16 00027.png
-rw-r--r-- 1 root root 643556 Mar 27 18:16 00028.png
-rw-r--r-- 1 root root 642177 Mar 27 18:16 00029.png
-rw-r--r-- 1 root root 644385 Mar 27 18:16 00030.png
-rw-r--r-- 1 root root 641750 Mar 27 18:16 00031.png
-rw-r--r-- 1 root root 641641 Mar 27 18:16 00032.png
-rw-r--r-- 1 root root 655825 Mar 27 18:16 00033.png
-rw-r--r-- 1 root root 651429 Mar 27 18:16 00034.png
-rw-r--r-- 1 root root 654648 Mar 27 18:16 00035.png
-rw-r--r-- 1 root root 646665 Mar 27 18:16 00036.png
-rw-r--r-- 1 root root 649182 Mar 27 18:16 00037.png
-rw-r--r-- 1 root root 646929 Mar 27 18:16 00038.png
-rw-r--r-- 1 root root 648801 Mar 27 18:16 00039.png
-rw-r--r-- 1 root root 650807 Mar 27 18:16 00040.png
-rw-r--r-- 1 root root 651238 Mar 27 18:16 00041.png
-rw-r--r-- 1 root root 654080 Mar 27 18:16 00042.png
-rw-r--r-- 1 root root 657997 Mar 27 18:16 00043.png
-rw-r--r-- 1 root root 654062 Mar 27 18:16 00044.png
-rw-r--r-- 1 root root 645044 Mar 27 18:16 00045.png
-rw-r--r-- 1 root root 649169 Mar 27 18:16 00046.png
-rw-r--r-- 1 root root 649032 Mar 27 18:16 00047.png
-rw-r--r-- 1 root root 648941 Mar 27 18:16 00048.png
-rw-r--r-- 1 root root 651319 Mar 27 18:16 00049.png
-rw-r--r-- 1 root root 647694 Mar 27 18:16 00050.png
-rw-r--r-- 1 root root 654670 Mar 27 18:16 00051.png
-rw-r--r-- 1 root root 658213 Mar 27 18:16 00052.png
-rw-r--r-- 1 root root 655096 Mar 27 18:16 00053.png
-rw-r--r-- 1 root root 654868 Mar 27 18:16 00054.png
-rw-r--r-- 1 root root 647661 Mar 27 18:16 00055.png
-rw-r--r-- 1 root root 648491 Mar 27 18:16 00056.png
-rw-r--r-- 1 root root 647378 Mar 27 18:16 00057.png
-rw-r--r-- 1 root root 653648 Mar 27 18:16 00058.png
-rw-r--r-- 1 root root 649479 Mar 27 18:16 00059.png
-rw-r--r-- 1 root root 648430 Mar 27 18:16 00060.png
-rw-r--r-- 1 root root 648655 Mar 27 18:16 00061.png
-rw-r--r-- 1 root root 651449 Mar 27 18:16 00062.png
-rw-r--r-- 1 root root 652229 Mar 27 18:16 00063.png
-rw-r--r-- 1 root root 648212 Mar 27 18:16 00064.png
-rw-r--r-- 1 root root 665119 Mar 27 18:16 00065.png
-rw-r--r-- 1 root root 667312 Mar 27 18:16 00066.png
-rw-r--r-- 1 root root 670765 Mar 27 18:16 00067.png
-rw-r--r-- 1 root root 686186 Mar 27 18:16 00068.png
-rw-r--r-- 1 root root 668374 Mar 27 18:16 00069.png
-rw-r--r-- 1 root root 660704 Mar 27 18:16 00070.png
-rw-r--r-- 1 root root 669603 Mar 27 18:16 00071.png
-rw-r--r-- 1 root root 666823 Mar 27 18:17 00072.png
-rw-r--r-- 1 root root 666856 Mar 27 18:17 00073.png
-rw-r--r-- 1 root root 669827 Mar 27 18:17 00074.png
-rw-r--r-- 1 root root 674255 Mar 27 18:17 00075.png
-rw-r--r-- 1 root root 672226 Mar 27 18:17 00076.png
-rw-r--r-- 1 root root 674705 Mar 27 18:17 00077.png
-rw-r--r-- 1 root root 674792 Mar 27 18:17 00078.png
-rw-r--r-- 1 root root 680698 Mar 27 18:17 00079.png
-rw-r--r-- 1 root root 675547 Mar 27 18:17 00080.png
-rw-r--r-- 1 root root 676059 Mar 27 18:17 00081.png
-rw-r--r-- 1 root root 676981 Mar 27 18:17 00082.png
-rw-r--r-- 1 root root 672135 Mar 27 18:17 00083.png
-rw-r--r-- 1 root root 671898 Mar 27 18:17 00084.png
-rw-r--r-- 1 root root 677357 Mar 27 18:17 00085.png
-rw-r--r-- 1 root root 682207 Mar 27 18:17 00086.png
-rw-r--r-- 1 root root 684609 Mar 27 18:17 00087.png
-rw-r--r-- 1 root root 683231 Mar 27 18:17 00088.png
-rw-r--r-- 1 root root 681440 Mar 27 18:17 00089.png
-rw-r--r-- 1 root root 682773 Mar 27 18:17 00090.png
-rw-r--r-- 1 root root 682412 Mar 27 18:17 00091.png
-rw-r--r-- 1 root root 684271 Mar 27 18:17 00092.png
-rw-r--r-- 1 root root 675812 Mar 27 18:17 00093.png
-rw-r--r-- 1 root root 676336 Mar 27 18:17 00094.png
-rw-r--r-- 1 root root 675458 Mar 27 18:17 00095.png
Thu Mar 27 18:17:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    On  |   00000000:C1:00.0 Off |                    0 |
| N/A   64C    P0            127W /  350W |   43607MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
total time 139.44789123535156
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
[image2 @ 0x645ff26b4a00] Pattern type 'glob_sequence' is deprecated: use pattern_type 'glob' instead
Input #0, image2, from './outputs/%*.png':
Duration: 00:00:03.84, start: 0.000000, bitrate: N/A
Stream #0:0: Video: png, rgb24(pc), 512x512, 25 fps, 25 tbr, 25 tbn, 25 tbc
[mp3 @ 0x645ff26bba80] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from '/tmp/tmph1pdxav8test1.mp3':
Duration: 00:00:06.01, start: 0.000000, bitrate: 320 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
Stream #1:0 -> #0:1 (mp3 (mp3float) -> aac (native))
Press [q] to stop, [?] for help
[image2 @ 0x645ff26b4a00] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
[libx264 @ 0x645ff26c2100] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x645ff26c2100] profile High, level 2.2, 4:2:0, 8-bit
[libx264 @ 0x645ff26c2100] 264 - core 163 r3060 5db6aa6 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=5 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=8 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=15 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=16 scenecut=40 intra_refresh=0 rc_lookahead=50 rc=crf mbtree=1 crf=20.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/tmp/z_interpollation.mp4':
Metadata:
encoder         : Lavf58.76.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, progressive), 512x512, q=2-31, 16 fps, 16384 tbn
Metadata:
encoder         : Lavc58.134.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s
Metadata:
encoder         : Lavc58.134.100 aac
frame=    1 fps=0.0 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
frame=   87 fps=0.0 q=24.0 size=    1024kB time=00:00:00.93 bitrate=8947.7kbits/s speed=1.76x
[mp4 @ 0x645ff26c0f40] Starting second pass: moving the moov atom to the beginning of the file
frame=   96 fps= 57 q=-1.0 Lsize=    7008kB time=00:00:05.99 bitrate=9583.0kbits/s speed=3.56x
video:6906kB audio:97kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.070338%
[libx264 @ 0x645ff26c2100] frame I:4     Avg QP:23.38  size:118358
[libx264 @ 0x645ff26c2100] frame P:26    Avg QP:25.31  size: 91010
[libx264 @ 0x645ff26c2100] frame B:66    Avg QP:26.27  size: 64109
[libx264 @ 0x645ff26c2100] consecutive B-frames:  7.3%  0.0%  9.4% 83.3%
[libx264 @ 0x645ff26c2100] mb I  I16..4:  0.0% 22.4% 77.5%
[libx264 @ 0x645ff26c2100] mb P  I16..4:  0.8% 19.5% 23.3%  P16..4: 18.8% 22.6% 15.0%  0.0%  0.0%    skip: 0.0%
[libx264 @ 0x645ff26c2100] mb B  I16..4:  0.3%  6.0%  4.6%  B16..8: 27.7% 22.9% 13.4%  direct:23.1%  skip: 2.0%  L0:31.9% L1:31.4% BI:36.6%
[libx264 @ 0x645ff26c2100] 8x8 transform intra:44.1% inter:29.5%
[libx264 @ 0x645ff26c2100] direct mvs  spatial:84.8% temporal:15.2%
[libx264 @ 0x645ff26c2100] coded y,uvDC,uvAC intra: 99.9% 97.9% 90.1% inter: 93.4% 87.7% 31.8%
[libx264 @ 0x645ff26c2100] i16 v,h,dc,p:  2%  9% 76% 14%
[libx264 @ 0x645ff26c2100] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 14%  9% 29%  8%  7%  9%  6%  9%  9%
[libx264 @ 0x645ff26c2100] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu:  8%  7% 17% 11% 11% 12% 10% 12% 12%
[libx264 @ 0x645ff26c2100] i8c dc,h,v,p: 41% 20% 21% 18%
[libx264 @ 0x645ff26c2100] Weighted P-Frames: Y:38.5% UV:38.5%
[libx264 @ 0x645ff26c2100] ref P L0: 42.8% 17.6% 12.9%  9.9%  7.2%  9.7%
[libx264 @ 0x645ff26c2100] ref B L0: 71.5% 19.1%  6.2%  3.1%
[libx264 @ 0x645ff26c2100] ref B L1: 90.0% 10.0%
[libx264 @ 0x645ff26c2100] kb/s:9427.79
[aac @ 0x645ff26cf700] Qavg: 942.794
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/z_interpollation.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.76.100
Duration: 00:00:06.01, start: 0.000000, bitrate: 9553 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 512x512, 9428 kb/s, 16 fps, 16 tbr, 16384 tbn, 32 tbc (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 132 kb/s (default)
Metadata:
handler_name    : SoundHandler
vendor_id       : [0][0][0][0]
Multiple -filter, -af or -vf options specified for stream 0, only the last option '-filter:v format=yuv420p' will be used.
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x5ec858f3e640] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x5ec858f3e640] profile High, level 2.2, 4:2:0, 8-bit
[libx264 @ 0x5ec858f3e640] 264 - core 163 r3060 5db6aa6 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=5 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=8 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=15 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=16 scenecut=40 intra_refresh=0 rc_lookahead=50 rc=crf mbtree=1 crf=20.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/tmp/z_interpollation_60fps.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.76.100
Stream #0:0(und): Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 512x512, q=2-31, 16 fps, 16384 tbn (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
encoder         : Lavc58.134.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name    : SoundHandler
vendor_id       : [0][0][0][0]
encoder         : Lavc58.134.100 aac
frame=    1 fps=0.0 q=0.0 size=       0kB time=00:00:00.88 bitrate=   0.4kbits/s speed=37.9x
frame=   90 fps=0.0 q=24.0 size=    1280kB time=00:00:05.94 bitrate=1764.1kbits/s speed=11.2x
[mp4 @ 0x5ec858f4b740] Starting second pass: moving the moov atom to the beginning of the file
frame=   96 fps= 62 q=-1.0 Lsize=    6889kB time=00:00:05.99 bitrate=9419.7kbits/s speed=3.88x
video:6785kB audio:98kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.071445%
[libx264 @ 0x5ec858f3e640] frame I:4     Avg QP:23.33  size:117846
[libx264 @ 0x5ec858f3e640] frame P:26    Avg QP:25.24  size: 89933
[libx264 @ 0x5ec858f3e640] frame B:66    Avg QP:26.22  size: 62692
[libx264 @ 0x5ec858f3e640] consecutive B-frames:  7.3%  0.0%  9.4% 83.3%
[libx264 @ 0x5ec858f3e640] mb I  I16..4:  0.0% 21.7% 78.3%
[libx264 @ 0x5ec858f3e640] mb P  I16..4:  0.9% 19.3% 24.2%  P16..4: 19.9% 20.3% 15.4%  0.0%  0.0%    skip: 0.0%
[libx264 @ 0x5ec858f3e640] mb B  I16..4:  0.3%  5.8%  5.0%  B16..8: 28.1% 21.7% 13.9%  direct:22.5%  skip: 2.6%  L0:32.9% L1:32.8% BI:34.3%
[libx264 @ 0x5ec858f3e640] 8x8 transform intra:42.5% inter:27.4%
[libx264 @ 0x5ec858f3e640] direct mvs  spatial:84.8% temporal:15.2%
[libx264 @ 0x5ec858f3e640] coded y,uvDC,uvAC intra: 99.9% 97.1% 86.0% inter: 92.4% 84.7% 28.7%
[libx264 @ 0x5ec858f3e640] i16 v,h,dc,p:  2%  8% 73% 17%
[libx264 @ 0x5ec858f3e640] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 13%  9% 29%  8%  7%  9%  7%  9%  9%
[libx264 @ 0x5ec858f3e640] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu:  8%  7% 18% 11% 11% 12%  9% 12% 12%
[libx264 @ 0x5ec858f3e640] i8c dc,h,v,p: 42% 20% 22% 16%
[libx264 @ 0x5ec858f3e640] Weighted P-Frames: Y:38.5% UV:38.5%
[libx264 @ 0x5ec858f3e640] ref P L0: 42.8% 17.3% 13.1%  9.8%  7.2%  9.8%
[libx264 @ 0x5ec858f3e640] ref B L0: 71.6% 19.0%  6.2%  3.2%
[libx264 @ 0x5ec858f3e640] ref B L1: 90.3%  9.7%
[libx264 @ 0x5ec858f3e640] kb/s:9263.11
[aac @ 0x5ec858f9c300] Qavg: 937.715
Version Details
Version ID
2ea07a2bdff3ff5f02c8a91b71fa627786b243174ca4a9ee10a0b00bd5771890
Version Created
March 27, 2025