bytedance/omni-human 🖼️ → 🖼️
About
Turns your audio/video/images into professional-quality animated videos
Example Output
Output
Performance Metrics
- Prediction Time: 157.05s
- Total Time: 157.06s
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/NSVCaWKzBxsoYSGGDvXNbyYAANdTI1LtQPGpHxyKTCxtaMo3/ElevenLabs_2025-08-01T09_30_34_Laura_pre_sp100_s50_sb75_v3.mp3",
  "image": "https://replicate.delivery/pbxt/NSVCZtHBKfQIQnoDEhoehnlyvDxrbZIyN4bUlpB4YiKCzl5e/image.png"
}
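As a rough illustration of how a payload like the one above might be assembled and submitted, here is a minimal Python sketch. The helper names `build_input` and `run_prediction` are hypothetical, and the sketch assumes the `replicate` Python client is installed with a `REPLICATE_API_TOKEN` set in the environment; neither is stated on this page.

```python
from urllib.parse import urlparse

# Model name as shown at the top of this page.
MODEL_REF = "bytedance/omni-human"


def build_input(audio_url: str, image_url: str) -> dict:
    """Build the input payload (hypothetical helper).

    Both fields are required, so each one is checked to be an http(s) URL.
    """
    for name, value in (("audio", audio_url), ("image", image_url)):
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {value!r}")
    return {"audio": audio_url, "image": image_url}


def run_prediction(payload: dict):
    """Submit the payload (hypothetical helper).

    Assumption: the `replicate` client package is installed and
    REPLICATE_API_TOKEN is set. It is imported lazily so that
    build_input can be used without it.
    """
    import replicate

    return replicate.run(MODEL_REF, input=payload)
```

Calling `run_prediction(build_input(audio_url, image_url))` would then start a prediction; judging by the metrics above, a run can take a couple of minutes.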
Input Parameters
- audio (required): Input audio file (MP3, WAV, etc.). For best quality, keep the audio to 15 seconds or less; beyond 15 seconds, video quality begins to degrade. To process longer audio, split it into 15-second chunks.
- image (required): Input image containing a human subject, face, or character.
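The recommendation to split long audio into 15-second chunks can be sketched as a small planning step. The function `chunk_spans` below is a hypothetical helper that only computes the start and end times of each chunk; actually cutting the audio file with a tool of your choice is left out.

```python
import math


def chunk_spans(duration_s: float, max_len_s: float = 15.0) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds covering the whole clip.

    Every span is at most max_len_s long; the last one may be shorter.
    """
    if max_len_s <= 0:
        raise ValueError("max_len_s must be positive")
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / max_len_s)
    # Index-based boundaries avoid accumulating floating-point error.
    return [
        (i * max_len_s, min((i + 1) * max_len_s, duration_s))
        for i in range(count)
    ]


# A 40-second clip is planned as three chunks: 0-15, 15-30, and 30-40.
spans = chunk_spans(40.0)
```

Each span could then be cut into its own audio file and submitted as a separate prediction with the same image.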
Output Schema
Output
Example Execution Logs
Checking for human subject...
Generating video...
Generated video in 153.5sec
Downloading 2384667 bytes
Downloaded 2.27MB in 2.82sec
Version Details
- Version ID: 1797f102db630bd6a6fae7b5d6559929657bb4b1c1616486d739178c1807c90f
- Version Created: September 19, 2025