gauravk95/sadtalker-video ✓❓🖼️ → 🖼️

▶️ 1.4K runs 📅 Jan 2024 ⚙️ Cog 0.8.6 🔗 GitHub
lipsync video-editing

About

Make the person in your video say anything: the model lip-syncs a source video to a driving audio track.

Example Output

Output

Performance Metrics

85.47s Prediction Time
298.66s Total Time
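The page does not say what fills the gap between the two figures. As a rough sketch, assuming everything outside prediction time is overhead (queueing, model start-up, file transfer), the split works out as follows. The variable names are illustrative only.

```python
# Figures copied from the Performance Metrics listed above.
prediction_time_s = 85.47
total_time_s = 298.66

# Assumption: time not spent predicting is overhead (queueing, start-up, transfers).
overhead_s = round(total_time_s - prediction_time_s, 2)
overhead_share = overhead_s / total_time_s

print(f"overhead: {overhead_s} s ({overhead_share:.1%} of total time)")
```

For this example run, the prediction itself accounts for well under half of the wall-clock time.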
All Input Parameters
{
  "use_DAIN": false,
  "enhancer_region": "lip",
  "audio_input_path": "https://replicate.delivery/pbxt/KCCITEBY84VLXxmQTovjbsq0ruQw8kJ3hcbTyvVf0ukEsJQj/chinese_poem1.wav",
  "video_input_path": "https://replicate.delivery/pbxt/KCCITJNxqqvNnw2w5Qp0799vPPFy5qh3TZSm0gYo9lyvilQw/1.mp4"
}
Input Parameters
use_DAIN Type: boolean, Default: false
Enable Depth-Aware Video Frame Interpolation (DAIN)
enhancer_region Default: lip
Choose a face enhancer region
audio_input_path (required) Type: string
Upload the driving audio; accepts .wav and .mp4 files
video_input_path (required) Type: string
Upload the source video, usually an .mp4 file
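The parameters above can be assembled into a request along these lines. This is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helpers `build_input` and `run_prediction` are hypothetical names introduced for illustration. The model reference combines the model name with the version ID listed on this page.

```python
from typing import Any, Dict

# Model name plus the version ID shown under "Version Details" on this page.
MODEL_REF = (
    "gauravk95/sadtalker-video:"
    "9804d1827f20dbc994c64eac2ac2fd835670209ec8a3dc3f2429c2d847414b11"
)


def build_input(
    video_input_path: str,
    audio_input_path: str,
    use_DAIN: bool = False,
    enhancer_region: str = "lip",
) -> Dict[str, Any]:
    """Build the input payload, using the defaults documented above."""
    if not video_input_path or not audio_input_path:
        raise ValueError("video_input_path and audio_input_path are required")
    return {
        "use_DAIN": use_DAIN,
        "enhancer_region": enhancer_region,
        "audio_input_path": audio_input_path,
        "video_input_path": video_input_path,
    }


def run_prediction(model_input: Dict[str, Any]) -> Any:
    """Send the payload to Replicate (network call, needs an API token)."""
    import replicate  # third-party client, imported lazily so the builder works offline

    # The return type may differ between client versions (URL string or file-like object).
    return replicate.run(MODEL_REF, input=model_input)


if __name__ == "__main__":
    # Placeholder URLs; substitute your own hosted files.
    payload = build_input(
        video_input_path="https://example.com/source.mp4",
        audio_input_path="https://example.com/speech.wav",
    )
    print(payload)
```

Keeping payload construction separate from the network call makes the required fields easy to check before spending a prediction.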
Output Schema

Output

Type: string, Format: uri
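Since the output is a single URI string, a caller would typically download it. The following is a sketch using only the Python standard library. The helper names and the `output.mp4` fallback are assumptions made for illustration, not part of the model's API.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def filename_from_uri(uri: str, default: str = "output.mp4") -> str:
    """Take the last path segment of the URI as the file name."""
    name = posixpath.basename(urlparse(uri).path)
    return name or default


def download_output(uri: str, dest_dir: str = ".") -> str:
    """Download the generated video and return the local path (network call)."""
    target = os.path.join(dest_dir, filename_from_uri(uri))
    urlretrieve(uri, target)
    return target
```

Hosted output URLs may expire, so saving the file soon after the prediction finishes is the safer habit.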

Example Execution Logs
Start Predictions...
3DMM Extraction for source image
landmark Det:: 100%|██████████| 135/135 [00:15<00:00,  8.52it/s]
3DMM Extraction In Video:: 100%|██████████| 135/135 [00:01<00:00, 91.50it/s]
mel:: 100%|██████████| 136/136 [00:00<00:00, 39123.82it/s]
audio2exp:: 100%|██████████| 14/14 [00:00<00:00, 81.42it/s]
Face Renderer::  99%|█████████▉| 135/136 [00:07<00:00, 16.89it/s]
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'results/2024_01_09_11.01.46/temp_tmpiuycqk491##tmpy824i3jtchinese_poem1.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.29.100
Duration: 00:00:05.40, start: 0.000000, bitrate: 69 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 256x256, 66 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, wav, from 'results/2024_01_09_11.01.46/tmpy824i3jtchinese_poem1.wav':
Duration: 00:00:05.40, bitrate: 256 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, mp4, to 'bab057ca-1797-4081-af8f-efd4d5d799ca.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.76.100
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 256x256, q=2-31, 66 kb/s, 25 fps, 25 tbr, 12800 tbn, 12800 tbc (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 16000 Hz, mono, fltp, 69 kb/s
Metadata:
encoder         : Lavc58.134.100 aac
frame=    0 fps=0.0 q=-1.0 size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
frame=  135 fps=0.0 q=-1.0 Lsize=      92kB time=00:00:05.37 bitrate= 140.8kbits/s speed= 308x
video:44kB audio:44kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 5.446096%
[aac @ 0x5563e2ed36c0] Qavg: 42759.586
The generated video is named tmpiuycqk491##tmpy824i3jtchinese_poem1.mp4 in results/2024_01_09_11.01.46
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
faceClone:: 100%|██████████| 135/135 [00:52<00:00,  2.56it/s]
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '4ce1cfac-74fa-4bbd-87f5-95ff48f17cfe.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2mp41
encoder         : Lavf59.27.100
Duration: 00:00:05.40, start: 0.000000, bitrate: 600 kb/s
Stream #0:0(und): Video: mpeg4 (Simple Profile) (mp4v / 0x7634706D), yuv420p, 700x700 [SAR 1:1 DAR 1:1], 598 kb/s, 25 fps, 25 tbr, 12800 tbn, 25 tbc (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, wav, from 'results/2024_01_09_11.01.46/tmpy824i3jtchinese_poem1.wav':
Duration: 00:00:05.40, bitrate: 256 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (mpeg4 (native) -> h264 (libx264))
Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x55c33629d900] using SAR=1/1
[libx264 @ 0x55c33629d900] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x55c33629d900] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 0x55c33629d900] 264 - core 163 r3060 5db6aa6 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=15 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'results/2024_01_09_11.01.46/tmpiuycqk491##tmpy824i3jtchinese_poem1_full.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2mp41
encoder         : Lavf58.76.100
Stream #0:0(und): Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 700x700 [SAR 1:1 DAR 1:1], q=2-31, 25 fps, 12800 tbn (default)
Metadata:
handler_name    : VideoHandler
vendor_id       : [0][0][0][0]
encoder         : Lavc58.134.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 16000 Hz, mono, fltp, 69 kb/s
Metadata:
encoder         : Lavc58.134.100 aac
frame=    1 fps=0.0 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
frame=  135 fps=0.0 q=-1.0 Lsize=     281kB time=00:00:05.37 bitrate= 428.6kbits/s speed=11.2x
video:233kB audio:44kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.691295%
[libx264 @ 0x55c33629d900] frame I:1     Avg QP:18.57  size: 11753
[libx264 @ 0x55c33629d900] frame P:58    Avg QP:20.29  size:  2937
[libx264 @ 0x55c33629d900] frame B:76    Avg QP:22.27  size:   728
[libx264 @ 0x55c33629d900] consecutive B-frames: 15.6% 26.7%  4.4% 53.3%
[libx264 @ 0x55c33629d900] mb I  I16..4: 29.2% 70.1%  0.6%
[libx264 @ 0x55c33629d900] mb P  I16..4:  1.3%  4.8%  0.1%  P16..4: 27.3%  5.9%  2.8%  0.0%  0.0%    skip:57.8%
[libx264 @ 0x55c33629d900] mb B  I16..4:  0.3%  1.0%  0.0%  B16..8: 25.6%  0.6%  0.0%  direct: 0.3%  skip:72.1%  L0:48.8% L1:50.5% BI: 0.7%
[libx264 @ 0x55c33629d900] 8x8 transform intra:76.1% inter:89.4%
[libx264 @ 0x55c33629d900] coded y,uvDC,uvAC intra: 34.0% 37.2% 3.7% inter: 3.5% 4.8% 0.0%
[libx264 @ 0x55c33629d900] i16 v,h,dc,p: 40% 38% 15%  7%
[libx264 @ 0x55c33629d900] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 24% 40%  1%  1%  1%  1%  1%  1%
[libx264 @ 0x55c33629d900] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 37% 29% 15%  2%  4%  3%  6%  2%  1%
[libx264 @ 0x55c33629d900] i8c dc,h,v,p: 52% 24% 22%  2%
[libx264 @ 0x55c33629d900] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x55c33629d900] ref P L0: 70.2%  6.3% 17.0%  6.5%
[libx264 @ 0x55c33629d900] ref B L0: 79.2% 17.1%  3.7%
[libx264 @ 0x55c33629d900] ref B L1: 96.0%  4.0%
[libx264 @ 0x55c33629d900] kb/s:351.74
[aac @ 0x55c33629f8c0] Qavg: 42759.586
The generated video is named results/2024_01_09_11.01.46/tmpiuycqk491##tmpy824i3jtchinese_poem1_full.mp4
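The progress lines in these logs follow the tqdm format, so per-stage throughput can be pulled out with a regular expression. This is a sketch only: `parse_tqdm_line` and `StageSummary` are hypothetical names, and the pattern is written against the lines shown here, not against every tqdm layout.

```python
import re
from typing import NamedTuple, Optional


class StageSummary(NamedTuple):
    stage: str
    done: int
    total: int
    rate_per_s: float  # iterations per second


# Matches e.g. "faceClone:: 100%|█████| 135/135 [00:52<00:00,  2.56it/s]"
_PROGRESS = re.compile(
    r"^(?P<stage>.+?)::\s+\d+%\|.*?\|\s*(?P<done>\d+)/(?P<total>\d+)\s+"
    r"\[[^\]]*?,\s*(?P<rate>\d+(?:\.\d+)?)(?P<unit>it/s|s/it)\]"
)


def parse_tqdm_line(line: str) -> Optional[StageSummary]:
    """Return stage name, progress and rate, or None if the line has no numeric rate."""
    match = _PROGRESS.match(line.strip())
    if match is None:
        return None
    rate = float(match.group("rate"))
    if match.group("unit") == "s/it":
        # Convert seconds-per-iteration into iterations-per-second.
        rate = 1.0 / rate
    return StageSummary(
        match.group("stage"), int(match.group("done")), int(match.group("total")), rate
    )


if __name__ == "__main__":
    line = "faceClone:: 100%|██████████| 135/135 [00:52<00:00,  2.56it/s]"
    summary = parse_tqdm_line(line)
    if summary is not None:
        print(summary.stage, round(summary.total / summary.rate_per_s, 1), "s")
```

Applied to the final line of each stage, this shows faceClone (135 frames at about 2.56 it/s, roughly 52 seconds) taking most of the 85.47 s prediction time. The Face Renderer stage finishes in about 8 seconds.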
Version Details
Version ID
9804d1827f20dbc994c64eac2ac2fd835670209ec8a3dc3f2429c2d847414b11
Version Created
January 9, 2024