camenduru/one-shot-talking-face 🖼️ → 🖼️
About
one-shot-talking-face-replicate

Example Output
Output
Performance Metrics
- Prediction Time: 81.53s
- Total Time: 241.80s
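The two timings above can be put in context with a little arithmetic. The sketch below computes how much of the total time was spent outside the prediction itself, and how many seconds of prediction were needed per second of input audio. The 12.67 s audio duration is taken from the execution logs on this page; the function name is hypothetical, and reading the non-prediction time as queueing or setup overhead is an assumption, not something the page states.

```python
def timing_breakdown(prediction_s, total_s, media_s):
    """Return (time outside prediction, prediction seconds per media second)."""
    overhead_s = total_s - prediction_s      # assumed to be queue/setup time
    per_media_second = prediction_s / media_s
    return overhead_s, per_media_second


# Values from this page: 81.53 s prediction, 241.80 s total, 12.67 s of audio.
overhead, per_second = timing_breakdown(81.53, 241.80, 12.67)
print(f"time outside prediction: {overhead:.2f}s")            # 160.27s
print(f"prediction time per second of audio: {per_second:.2f}s")  # 6.43s
```

For this run, roughly two thirds of the total time fell outside the prediction step.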
All Input Parameters
{ "wav_file": "https://replicate.delivery/pbxt/KKmxSv74MwXgpnBWkyEbett4FFk3DoQvt8YTIXgNWo2Tns6o/test_audio%20%281%29.wav", "image_file": "https://replicate.delivery/pbxt/KKmxTcthtU2XzfDwfZWkij0wE4cTNzopfcH2vYMNKFS9wgNG/image%20-%202024-02-02T181207.989.png" }
Input Parameters
- wav_file
- image_file (required): Input Image
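As a rough illustration of how these parameters might be passed to the model, the sketch below builds the input dictionary and hands it to the `replicate` Python client. The helper names `build_input` and `run_prediction` are hypothetical. Treating `wav_file` as optional is an assumption based only on the listing marking `image_file` as required; the model may still need audio to produce a useful result. The client call needs the third-party `replicate` package and a `REPLICATE_API_TOKEN` in the environment.

```python
# Model reference built from the name and version ID shown on this page.
MODEL_REF = (
    "camenduru/one-shot-talking-face:"
    "b9b9d4b827a5ce99e14e0804b594353a63dfb48ab214fc6d12c40c4e7ea00e41"
)


def build_input(image_file, wav_file=None):
    """Build the input payload; only image_file is listed as required."""
    if not image_file:
        raise ValueError("image_file is required")
    payload = {"image_file": image_file}
    if wav_file is not None:
        payload["wav_file"] = wav_file
    return payload


def run_prediction(payload):
    """Submit the payload to Replicate (network call, not run at import time)."""
    import replicate  # third-party client; reads REPLICATE_API_TOKEN

    return replicate.run(MODEL_REF, input=payload)
```

A call such as `run_prediction(build_input("https://example.com/face.png", "https://example.com/speech.wav"))` would submit one prediction; the example URLs are placeholders.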
Output Schema
Output
Example Execution Logs
ffmpeg version 5.1.4-0+deb12u1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr --extra-version=0+deb12u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil 57. 28.100 / 57. 28.100
  libavcodec 59. 37.100 / 59. 37.100
  libavformat 59. 27.100 / 59. 27.100
  libavdevice 59. 7.100 / 59. 7.100
  libavfilter 8. 44.100 / 8. 44.100
  libswscale 6. 7.100 / 6. 7.100
  libswresample 4. 7.100 / 4. 7.100
  libpostproc 56. 6.100 / 56. 6.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '/content/train/audio.wav':
  Duration: 00:00:12.67, bitrate: 384 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, mono, s16, 384 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
-async is forwarded to lavfi similarly to -af aresample=async=1:min_hard_comp=0.100000:first_pts=0.
Output #0, wav, to '/content/one-shot-talking-face/samples/temp.wav':
  Metadata:
    ISFT : Lavf59.27.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder : Lavc59.37.100 pcm_s16le
size= 3kB time=00:00:00.08 bitrate= 263.4kbits/s speed=N/A
size= 396kB time=00:00:12.67 bitrate= 256.0kbits/s speed= 825x
video:0kB audio:396kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.019235%
/content/one-shot-talking-face/OpenFace/FeatureExtraction: error while loading shared libraries: libtiff.so.5: cannot open shared object file: No such file or directory
/usr/local/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/usr/local/lib/python3.10/site-packages/torch/nn/functional.py:4227: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn(
/usr/local/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
save video to: /content/train/temp/image_audio.mp4
ffmpeg version 5.1.4-0+deb12u1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr --extra-version=0+deb12u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil 57. 28.100 / 57. 28.100
  libavcodec 59. 37.100 / 59. 37.100
  libavformat 59. 27.100 / 59. 27.100
  libavdevice 59. 7.100 / 59. 7.100
  libavfilter 8. 44.100 / 8. 44.100
  libswscale 6. 7.100 / 6. 7.100
  libswresample 4. 7.100 / 4. 7.100
  libpostproc 56. 6.100 / 56. 6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/train/temp/image_audio.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf58.29.100
  Duration: 00:00:12.64, start: 0.000000, bitrate: 111 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 256x256, 108 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
    Metadata:
      handler_name : VideoHandler
      vendor_id : [0][0][0][0]
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, wav, from '/content/train/audio.wav':
  Duration: 00:00:12.67, bitrate: 384 kb/s
  Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, mono, s16, 384 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, mp4, to '/content/train/image_audio.mp4':
  Metadata:
    major_brand : isom
    minor_version : 512
    compatible_brands: isomiso2avc1mp41
    encoder : Lavf59.27.100
  Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 256x256, q=2-31, 108 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
    Metadata:
      handler_name : VideoHandler
      vendor_id : [0][0][0][0]
  Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 24000 Hz, mono, fltp, 69 kb/s
    Metadata:
      encoder : Lavc59.37.100 aac
frame= 0 fps=0.0 q=-1.0 size= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
frame= 316 fps=0.0 q=-1.0 Lsize= 289kB time=00:00:12.67 bitrate= 186.6kbits/s speed=38.9x
video:167kB audio:113kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.229118%
[aac @ 0x5752574bbe00] Qavg: 597.293
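The logs show two ffmpeg passes: the 24000 Hz input audio is resampled to a 16000 Hz WAV, and the generated silent video is later combined with the original audio, with the video stream copied and the audio encoded to AAC. The sketch below builds command lines that would produce the same stream mappings. It is a reconstruction from the log output only: the exact flags the model's code uses are not shown on this page, and the function names and example paths in the calls are illustrative.

```python
import shlex


def resample_cmd(src, dst, rate=16000):
    """ffmpeg arguments to resample an audio file to `rate` Hz."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(rate), dst]


def mux_cmd(video, audio, dst):
    """ffmpeg arguments to copy the video stream and encode the audio as AAC."""
    return [
        "ffmpeg", "-y",
        "-i", video,      # input 0: video without sound
        "-i", audio,      # input 1: original audio
        "-c:v", "copy",   # keep the H.264 stream as-is
        "-c:a", "aac",    # encode PCM audio to AAC
        dst,
    ]


# Print the commands instead of running them; paths mirror those in the logs.
print(shlex.join(resample_cmd("/content/train/audio.wav", "temp.wav")))
print(shlex.join(mux_cmd("temp/image_audio.mp4", "/content/train/audio.wav",
                         "image_audio.mp4")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.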
Version Details
- Version ID: b9b9d4b827a5ce99e14e0804b594353a63dfb48ab214fc6d12c40c4e7ea00e41
- Version Created: February 2, 2024