bytedance/omni-human 🖼️ → 🖼️
About
Turns your audio/video/images into professional-quality animated videos
Example Output
Output
Performance Metrics
- Prediction Time: 157.05s
- Total Time: 157.06s
All Input Parameters
{
"audio": "https://replicate.delivery/pbxt/NSVCaWKzBxsoYSGGDvXNbyYAANdTI1LtQPGpHxyKTCxtaMo3/ElevenLabs_2025-08-01T09_30_34_Laura_pre_sp100_s50_sb75_v3.mp3",
"image": "https://replicate.delivery/pbxt/NSVCZtHBKfQIQnoDEhoehnlyvDxrbZIyN4bUlpB4YiKCzl5e/image.png"
}
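As a rough sketch of how the input above might be submitted programmatically, the snippet below builds the same two-field payload and passes it to the Replicate Python client. It assumes the third-party `replicate` package is installed and that `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are hypothetical and exist only for illustration.

```python
MODEL_REF = "bytedance/omni-human"  # model name as shown on this page


def build_input(audio_url, image_url):
    """Return the input payload; both fields are marked required on this page."""
    if not audio_url or not image_url:
        raise ValueError("both 'audio' and 'image' are required")
    return {"audio": audio_url, "image": image_url}


def run_prediction(payload):
    """Submit the payload and return whatever the client returns.

    Assumption: the `replicate` package is installed and an API token is
    available as REPLICATE_API_TOKEN. The import is done lazily so that
    build_input can be used without the package.
    """
    import replicate

    return replicate.run(MODEL_REF, input=payload)
```

Calling `run_prediction(build_input(audio_url, image_url))` with the two URLs from the example would start a prediction. Judging by the metrics above, a run of this kind can take a few minutes.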
Input Parameters
- audio (required)
- Input audio file (MP3, WAV, etc.). For best output quality, keep the audio to 15 seconds or less; beyond 15 seconds, video quality begins to degrade. To process longer audio, we recommend splitting it into 15-second chunks.
- image (required)
- Input image containing a human subject, face or character.
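The 15-second recommendation for `audio` can be followed by splitting a long recording before upload. The sketch below is one possible approach for uncompressed WAV files, using only Python's standard `wave` module. The function name `split_wav` and the `_partNNN` file naming are hypothetical; MP3 or other compressed formats would need a different tool.

```python
import os
import wave

MAX_CHUNK_SECONDS = 15  # limit recommended in the audio parameter description


def split_wav(path, chunk_seconds=MAX_CHUNK_SECONDS):
    """Split a WAV file into consecutive chunks of at most chunk_seconds.

    Chunks are written next to the input file; their paths are returned
    in playback order.
    """
    base, _ = os.path.splitext(path)
    chunk_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # no audio left to read
            chunk_path = f"{base}_part{index:03d}.wav"  # hypothetical naming
            with wave.open(chunk_path, "wb") as dst:
                dst.setparams(params)    # same channels, sample width, rate
                dst.writeframes(frames)  # header frame count is fixed up here
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each resulting chunk could then be submitted as a separate `audio` input with the same `image`. Note that this splits at fixed intervals, so a cut may land mid-word; splitting at pauses would likely give cleaner results.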
Output Schema
Output
Example Execution Logs
Checking for human subject...
Generating video...
Generated video in 153.5sec
Downloading 2384667 bytes
Downloaded 2.27MB in 2.82sec
Version Details
- Version ID: 566f1b03016969ac39e242c1ae4a39034686ca8850fc3dba83dceaceb96f74b2
- Version Created: November 10, 2025