bytedance/omni-human 🖼️ → 🖼️

⭐ Official ▶️ 150.4K runs 📅 Jul 2025 ⚙️ Cog 0.16.8
audio-to-video image-to-video image-to-video-with-audio lipsync

About

Turns your audio and images into professional-quality animated videos.

Example Output

Performance Metrics

157.05s Prediction Time
157.06s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/NSVCaWKzBxsoYSGGDvXNbyYAANdTI1LtQPGpHxyKTCxtaMo3/ElevenLabs_2025-08-01T09_30_34_Laura_pre_sp100_s50_sb75_v3.mp3",
  "image": "https://replicate.delivery/pbxt/NSVCZtHBKfQIQnoDEhoehnlyvDxrbZIyN4bUlpB4YiKCzl5e/image.png"
}
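As a rough sketch, the two inputs above could be submitted through Replicate's HTTP predictions endpoint. The helper name `build_prediction_request` and the token argument are assumptions made for illustration; the version ID is the one listed under Version Details. The request is only constructed here, not sent, and the endpoint and header layout should be checked against the current Replicate API documentation.

```python
import json
import urllib.request

# Version ID copied from the Version Details section of this page.
VERSION_ID = "566f1b03016969ac39e242c1ae4a39034686ca8850fc3dba83dceaceb96f74b2"


def build_prediction_request(audio_url: str, image_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a prediction.

    Hypothetical helper: the endpoint and headers are assumed to follow
    Replicate's public HTTP API for creating predictions.
    """
    payload = {
        "version": VERSION_ID,
        # The model documents exactly two required inputs: audio and image.
        "input": {"audio": audio_url, "image": image_url},
    }
    return urllib.request.Request(
        "https://api.replicate.com/v1/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` with a real API token would send it; the response is expected to be a JSON description of the prediction, which then has to be polled until it finishes.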
Input Parameters
audio (required) Type: string
Input audio file (MP3, WAV, etc.). For the best quality, keep the audio to 15 seconds or less; beyond 15 seconds the video quality begins to degrade. To process longer audio, we recommend splitting it into 15-second chunks.
image (required) Type: string
Input image containing a human subject, face or character.
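The 15-second recommendation for the audio input can be planned with a small helper. This is a minimal sketch: `chunk_bounds` is a hypothetical function that only computes the start and end offsets of each chunk; cutting the audio file itself is left to whatever audio tool you already use.

```python
import math


def chunk_bounds(total_seconds: float, chunk_seconds: float = 15.0) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds, each chunk at most chunk_seconds long."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / chunk_seconds)
    # Offsets are derived from the chunk index so rounding error does not accumulate.
    return [
        (i * chunk_seconds, min((i + 1) * chunk_seconds, total_seconds))
        for i in range(count)
    ]


print(chunk_bounds(40.0))  # [(0.0, 15.0), (15.0, 30.0), (30.0, 40.0)]
```

Each chunk would then be submitted as its own prediction with the same image, and the resulting clips joined afterwards.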
Output Schema

Output

Type: string
Format: uri
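Since the output is declared as a string in URI format, a caller may want to sanity-check the value before trying to download it. The sketch below uses only the standard library; `is_http_uri` is a hypothetical helper, and restricting the scheme to http or https is an assumption rather than something the schema states.

```python
from urllib.parse import urlparse


def is_http_uri(value: object) -> bool:
    """Return True if value is a string that parses as an http(s) URI with a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Placeholder URL for illustration only, not a real model output.
print(is_http_uri("https://example.com/output.mp4"))  # True
print(is_http_uri("output.mp4"))                      # False
```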

Example Execution Logs
Checking for human subject...
Generating video...
Generated video in 153.5sec
Downloading 2384667 bytes
Downloaded 2.27MB in 2.82sec
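The figures in these logs can be cross-checked with a little arithmetic. The sketch below assumes the logs and the Performance Metrics above describe the same run, and that "MB" in the log means 1024 × 1024 bytes (which is what reproduces the logged 2.27MB).

```python
# Values copied from the example logs and Performance Metrics on this page.
BYTES_DOWNLOADED = 2_384_667
DOWNLOAD_SECONDS = 2.82
GENERATION_SECONDS = 153.5
PREDICTION_SECONDS = 157.05

# Size in units of 1024 * 1024 bytes.
size_mb = BYTES_DOWNLOADED / (1024 * 1024)
# Average download throughput over the logged download time.
throughput_mb_per_s = size_mb / DOWNLOAD_SECONDS
# Prediction time not spent in the video generation step itself.
overhead_seconds = PREDICTION_SECONDS - GENERATION_SECONDS

print(f"size: {size_mb:.2f} MB")                   # size: 2.27 MB
print(f"throughput: {throughput_mb_per_s:.2f} MB/s")  # throughput: 0.81 MB/s
print(f"overhead: {overhead_seconds:.2f} s")       # overhead: 3.55 s
```

Under these assumptions, roughly 3.55 seconds of the prediction fall outside generation, of which 2.82 seconds are the logged download.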
Version Details
Version ID
566f1b03016969ac39e242c1ae4a39034686ca8850fc3dba83dceaceb96f74b2
Version Created
November 10, 2025