wan-video/wan-2.2-s2v 🔢🖼️📝 → 🖼️

⭐ Official ▶️ 3.4K runs 📅 Sep 2025 ⚙️ Cog 0.16.0 📄 Paper ⚖️ License

audio-to-video image-to-video lipsync

About

Generate a video from an audio clip and a reference image.

Example Output

Prompt:

"woman singing"

Performance Metrics

111.54s Prediction Time

111.56s Total Time

All Input Parameters

{
  "audio": "https://replicate.delivery/pbxt/NgrFdGZvxAfqCxi5cKn6EVyOfaW2WbMnK0X6xzUcjx7bjvNT/replicate-prediction-p490m1zkhsrm80cs6myvm1atbm.mp3",
  "image": "https://replicate.delivery/pbxt/Ngr8dvy1hz1VmBdYWNedHrKTLaSh4OYYJc2nOpDb4Cj17T9O/replicate-prediction-q9nq4pzxa1rma0cs6my8r8507c.jpg",
  "prompt": "woman singing",
  "interpolate": false,
  "num_frames_per_chunk": 81
}
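
The input above could be submitted from Python roughly as follows. This is a minimal sketch, assuming the third-party `replicate` client is installed and `REPLICATE_API_TOKEN` is set in the environment. `MODEL`, `EXAMPLE_INPUT`, and `run_example` are illustrative names and do not come from the model page.

```python
# Sketch: submit the example input with the Replicate Python client.
# Assumes `pip install replicate` and REPLICATE_API_TOKEN in the environment.

MODEL = "wan-video/wan-2.2-s2v"

# Same values as the "All Input Parameters" example on this page.
EXAMPLE_INPUT = {
    "audio": "https://replicate.delivery/pbxt/NgrFdGZvxAfqCxi5cKn6EVyOfaW2WbMnK0X6xzUcjx7bjvNT/replicate-prediction-p490m1zkhsrm80cs6myvm1atbm.mp3",
    "image": "https://replicate.delivery/pbxt/Ngr8dvy1hz1VmBdYWNedHrKTLaSh4OYYJc2nOpDb4Cj17T9O/replicate-prediction-q9nq4pzxa1rma0cs6my8r8507c.jpg",
    "prompt": "woman singing",
    "interpolate": False,
    "num_frames_per_chunk": 81,
}


def run_example(model=MODEL, payload=EXAMPLE_INPUT):
    """Run the model and return whatever the client returns as output."""
    import replicate  # imported lazily so this file loads without the client

    return replicate.run(model, input=dict(payload))
```

Calling `run_example()` blocks until the prediction finishes (about 112 seconds in the example above). Depending on the client version, the return value may be a URL string or a file-like object wrapping that URL.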

Input Parameters

seed Type: integer. Random seed. Leave blank for random.
audio (required) Type: string. Audio file to synchronize the video with.
image (required) Type: string. First frame image to start the video from.
prompt (required) Type: string. Prompt for video generation.
num_frames_per_chunk Type: integer. Default: 81. Range: 1 - 121. Number of frames per video chunk.
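
The constraints listed above can be checked locally before a request is sent. The following is a sketch only: `validate_input` is a hypothetical helper, and it covers just the parameters documented in this list (required strings, the 1 to 121 range, an optional integer seed).

```python
def validate_input(payload):
    """Return a list of problems found in a payload; an empty list means OK."""
    problems = []

    # audio, image, and prompt are marked as required strings.
    for name in ("audio", "image", "prompt"):
        value = payload.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required and must be a non-empty string")

    # num_frames_per_chunk defaults to 81 and must lie in 1..121.
    frames = payload.get("num_frames_per_chunk", 81)
    if isinstance(frames, bool) or not isinstance(frames, int) or not 1 <= frames <= 121:
        problems.append("num_frames_per_chunk must be an integer in 1..121")

    # seed is optional; when given it must be an integer.
    seed = payload.get("seed")
    if seed is not None and (isinstance(seed, bool) or not isinstance(seed, int)):
        problems.append("seed must be an integer or omitted")

    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem at once.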

Output Schema

Output

Type: string • Format: uri
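
Because the output is a URI string, the generated video can be saved with the standard library alone. This is a minimal sketch: `save_output` is an illustrative helper name, and it assumes the URI is directly fetchable without extra authentication headers.

```python
import os
import shutil
import urllib.request


def save_output(uri, dest_path):
    """Stream the file at `uri` to `dest_path` and return its size in bytes."""
    with urllib.request.urlopen(uri) as response, open(dest_path, "wb") as out:
        # Copy in chunks so a large video is never held fully in memory.
        shutil.copyfileobj(response, out)
    return os.path.getsize(dest_path)
```

For example, `save_output(output_uri, "output.mp4")`, where `output_uri` is the string returned by the prediction.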

Example Execution Logs

Generating video with dimensions: 768x768
Generating video...
Merging audio with video...
INFO:root:Start merging video and audio...
INFO:root:Merge completed, saved to /tmp/tmp6913qylb/output.mp4
Video generation completed: /tmp/tmp6913qylb/output.mp4
/root/.pyenv/versions/3.11.13/lib/python3.11/site-packages/cog/server/scope.py:22: ExperimentalFeatureWarning: current_scope is an experimental internal function. It may change or be removed without warning.
warnings.warn(

Version Details

Version ID: 09607e6e761d2f015b0d740f938ec59199f54aa623384465a5054b230405acf4
Version Created: September 12, 2025
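
To keep results reproducible, a request can name this exact version instead of the latest one. The sketch below builds the `owner/name:version` reference that the Replicate client accepts; `pinned_ref` is a hypothetical helper name.

```python
MODEL = "wan-video/wan-2.2-s2v"
VERSION_ID = "09607e6e761d2f015b0d740f938ec59199f54aa623384465a5054b230405acf4"


def pinned_ref(model=MODEL, version=VERSION_ID):
    """Return a model reference pinned to one specific version."""
    return f"{model}:{version}"
```

The resulting string can be passed wherever the bare model name would otherwise go, for example as the first argument to `replicate.run`.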
