bytedance/omni-human 🖼️ → 🖼️

⭐ Official ▶️ 141.1K runs 📅 Jul 2025 ⚙️ Cog 0.16.7
audio-to-video image-to-video-with-audio lipsync

About

Turns an audio clip and an image of a human subject, face, or character into a professional-quality animated video.

Example Output

Output

Performance Metrics

157.05s Prediction Time
157.06s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/NSVCaWKzBxsoYSGGDvXNbyYAANdTI1LtQPGpHxyKTCxtaMo3/ElevenLabs_2025-08-01T09_30_34_Laura_pre_sp100_s50_sb75_v3.mp3",
  "image": "https://replicate.delivery/pbxt/NSVCZtHBKfQIQnoDEhoehnlyvDxrbZIyN4bUlpB4YiKCzl5e/image.png"
}
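The input object above can be submitted over Replicate's HTTP API. The following is a minimal sketch, not an official client: it assumes the official-model predictions endpoint (`/v1/models/{owner}/{name}/predictions`) and a token in the `REPLICATE_API_TOKEN` environment variable. The helper name `build_prediction_request` and the example.com URLs are hypothetical. The request is only built, not sent, so the payload can be inspected first.

```python
import json
import os
import urllib.request

# Assumed endpoint for creating a prediction on an official model.
API_URL = "https://api.replicate.com/v1/models/bytedance/omni-human/predictions"


def build_prediction_request(audio_url, image_url, token):
    """Build (but do not send) a POST request that would start a prediction."""
    if not audio_url or not image_url:
        raise ValueError("both 'audio' and 'image' are required")
    # Mirrors the two required inputs listed on this page.
    payload = {"input": {"audio": audio_url, "image": image_url}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical placeholder URLs; substitute your own hosted files.
    req = build_prediction_request(
        "https://example.com/speech.mp3",
        "https://example.com/portrait.png",
        os.environ.get("REPLICATE_API_TOKEN", ""),
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`. The example run on this page took about 157 seconds, so a real caller would likely need to poll for the result rather than expect an immediate output.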
Input Parameters
audio (required) Type: string
Input audio file (MP3, WAV, etc.). For best results, keep the audio to 15 seconds or less; video quality begins to degrade beyond that. To process longer audio, split it into 15-second chunks.
image (required) Type: string
Input image containing a human subject, face or character.
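The 15-second recommendation for the audio input can be planned with a little arithmetic. The sketch below only computes chunk boundaries in seconds; the function name `chunk_boundaries` is hypothetical, and the actual cutting would be done with an audio tool of your choice. Fixed-length cuts may land mid-word, so treat the boundaries as a starting point.

```python
import math


def chunk_boundaries(duration_s, max_chunk_s=15.0):
    """Return (start, end) pairs in seconds that cover the whole clip.

    Every chunk is at most max_chunk_s long; the last one may be shorter.
    """
    if max_chunk_s <= 0:
        raise ValueError("max_chunk_s must be positive")
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / max_chunk_s)
    return [
        (i * max_chunk_s, min((i + 1) * max_chunk_s, float(duration_s)))
        for i in range(count)
    ]


if __name__ == "__main__":
    # A 40-second clip becomes two full chunks and one 10-second remainder.
    print(chunk_boundaries(40))  # [(0.0, 15.0), (15.0, 30.0), (30.0, 40.0)]
```

Each chunk would then be submitted as a separate prediction with the same image, and the resulting videos joined afterwards.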
Output Schema

Output

Type: string, Format: uri
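Because the output is a URI string rather than the video bytes, a caller has to fetch the file separately. The following sketch shows one way to validate the URI and choose a local file name before downloading. The helper name `local_name_for_output` and the `output.mp4` fallback are assumptions; this page does not state the container format.

```python
import posixpath
from urllib.parse import urlparse


def local_name_for_output(output_uri, default="output.mp4"):
    """Pick a local file name for an output URI, rejecting non-HTTP(S) URIs."""
    parsed = urlparse(output_uri)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URI: {output_uri!r}")
    # Use the last path segment, or fall back to the assumed default name.
    return posixpath.basename(parsed.path) or default


if __name__ == "__main__":
    # Hypothetical output URI, for illustration only.
    print(local_name_for_output("https://example.com/files/result.mp4"))
```

With a name in hand, `urllib.request.urlretrieve(output_uri, name)` would save the file locally.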

Example Execution Logs
Checking for human subject...
Generating video...
Generated video in 153.5sec
Downloading 2384667 bytes
Downloaded 2.27MB in 2.82sec
Version Details
Version ID
1797f102db630bd6a6fae7b5d6559929657bb4b1c1616486d739178c1807c90f
Version Created
September 19, 2025