wan-video/wan-2.2-s2v 🔢🖼️📝 → 🖼️

⭐ Official ▶️ 3.4K runs 📅 Sep 2025 ⚙️ Cog 0.16.0 📄 Paper ⚖️ License
audio-to-video image-to-video lipsync

About

Generate a video from an audio clip and a reference image

Example Output

Prompt:

"woman singing"

Output

Performance Metrics

111.54s Prediction Time
111.56s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/NgrFdGZvxAfqCxi5cKn6EVyOfaW2WbMnK0X6xzUcjx7bjvNT/replicate-prediction-p490m1zkhsrm80cs6myvm1atbm.mp3",
  "image": "https://replicate.delivery/pbxt/Ngr8dvy1hz1VmBdYWNedHrKTLaSh4OYYJc2nOpDb4Cj17T9O/replicate-prediction-q9nq4pzxa1rma0cs6my8r8507c.jpg",
  "prompt": "woman singing",
  "interpolate": false,
  "num_frames_per_chunk": 81
}
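A minimal sketch of submitting the example input above from Python. It assumes the third-party `replicate` client package is installed and that a `REPLICATE_API_TOKEN` environment variable is set. Neither assumption comes from this page. `MODEL_REF`, `EXAMPLE_INPUT`, and `run_example` are illustrative names; the version hash is the one listed under Version Details.

```python
# Illustrative sketch: names below are hypothetical, not part of the model page.
MODEL_REF = (
    "wan-video/wan-2.2-s2v:"
    "09607e6e761d2f015b0d740f938ec59199f54aa623384465a5054b230405acf4"
)

# Same values as the example input shown on this page.
EXAMPLE_INPUT = {
    "audio": "https://replicate.delivery/pbxt/NgrFdGZvxAfqCxi5cKn6EVyOfaW2WbMnK0X6xzUcjx7bjvNT/replicate-prediction-p490m1zkhsrm80cs6myvm1atbm.mp3",
    "image": "https://replicate.delivery/pbxt/Ngr8dvy1hz1VmBdYWNedHrKTLaSh4OYYJc2nOpDb4Cj17T9O/replicate-prediction-q9nq4pzxa1rma0cs6my8r8507c.jpg",
    "prompt": "woman singing",
    "interpolate": False,
    "num_frames_per_chunk": 81,
}


def run_example(model_ref=MODEL_REF, payload=None):
    """Submit the payload and block until the prediction finishes."""
    # Imported lazily so this module loads even without the client installed.
    import replicate  # assumption: `pip install replicate`

    return replicate.run(model_ref, input=payload or EXAMPLE_INPUT)
```

Calling `run_example()` would start a real prediction. The example on this page took roughly 112 seconds, so expect the call to block for a comparable time.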
Input Parameters
seed Type: integer
Random seed. Leave blank to use a random seed
audio (required) Type: string
Audio file to synchronize the video with
image (required) Type: string
First frame image to start the video from
prompt (required) Type: string
Prompt for video generation
num_frames_per_chunk Type: integer, Default: 81, Range: 1 - 121
Number of frames per video chunk
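The constraints listed above (three required string inputs, an optional integer seed, and a chunk size between 1 and 121) can be checked locally before a request is sent. The helper below is a hypothetical sketch, not part of the model's API. It covers only the parameters documented in this list.

```python
def build_input(audio, image, prompt, seed=None, num_frames_per_chunk=81):
    """Return an input payload, raising if a documented constraint is violated."""
    # audio, image and prompt are marked (required) and typed as strings.
    for name, value in (("audio", audio), ("image", image), ("prompt", prompt)):
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} is required and must be a non-empty string")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(num_frames_per_chunk, bool) or not isinstance(num_frames_per_chunk, int):
        raise TypeError("num_frames_per_chunk must be an integer")
    if not 1 <= num_frames_per_chunk <= 121:
        raise ValueError("num_frames_per_chunk must be in the range 1 - 121")

    payload = {
        "audio": audio,
        "image": image,
        "prompt": prompt,
        "num_frames_per_chunk": num_frames_per_chunk,
    }

    # The seed is optional; omit it from the payload to let the model pick one.
    if seed is not None:
        if isinstance(seed, bool) or not isinstance(seed, int):
            raise TypeError("seed must be an integer")
        payload["seed"] = seed
    return payload
```

For example, `build_input("a.mp3", "a.jpg", "woman singing")` yields a payload with the default chunk size of 81 and no `seed` key.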
Output Schema

Output

Type: string, Format: uri
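Because the output is a URI rather than the video bytes, a caller usually has to download the file. The sketch below uses only the Python standard library. `output_filename` and `save_output` are hypothetical helper names, and coercing the value with `str()` is an assumption meant to cover clients that wrap the URI in an object.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def output_filename(uri, default="output.mp4"):
    """Derive a local file name from the last path segment of the URI."""
    parsed = urlparse(str(uri))
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URI, got: {uri!r}")
    # Fall back to a default name when the path has no final segment.
    return os.path.basename(parsed.path) or default


def save_output(uri, directory="."):
    """Download the file behind the URI and return the local path."""
    path = os.path.join(directory, output_filename(uri))
    # Stream the response to disk instead of holding the video in memory.
    with urllib.request.urlopen(str(uri)) as response, open(path, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return path
```

Only `save_output` touches the network; `output_filename` can be used on its own to decide where a result should be stored.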

Example Execution Logs
Generating video with dimensions: 768x768
Generating video...
Merging audio with video...
INFO:root:Start merging video and audio...
INFO:root:Merge completed, saved to /tmp/tmp6913qylb/output.mp4
Video generation completed: /tmp/tmp6913qylb/output.mp4
/root/.pyenv/versions/3.11.13/lib/python3.11/site-packages/cog/server/scope.py:22: ExperimentalFeatureWarning: current_scope is an experimental internal function. It may change or be removed without warning.
warnings.warn(
Version Details
Version ID
09607e6e761d2f015b0d740f938ec59199f54aa623384465a5054b230405acf4
Version Created
September 12, 2025