lightricks/audio-to-video 🖼️📝🔢 → 🖼️

⭐ Official ▶️ 707 runs 📅 Jan 2026 ⚙️ Cog 0.16.9
audio-to-video image-to-video lipsync

About

Generate a video from an audio input combined with an image, a text prompt, or both.

Example Output

Prompt:

"a woman speaks the words. her mouth moves up and down with the cadence of the words to make it look like it is speaking the words."

Output

Performance Metrics

35.71s Prediction Time
35.73s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/OUCNclpatd8lrBLaHbYZFRoOHKRgpkNyzL9MyjB3qdabxERb/Chatterbox%20Text%20to%20Speech.mp3",
  "prompt": "a woman speaks the words. her mouth moves up and down with the cadence of the words to make it look like it is speaking the words.",
  "guidance_scale": 16.88
}
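A minimal sketch of how the input above might be submitted with the Replicate Python client. It assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set; `build_input` and `run_model` are hypothetical helper names, not part of the model or the client. The form of the return value (a URL string or a file-like object) may depend on the client version.

```python
import os

MODEL = "lightricks/audio-to-video"


def build_input(audio_url, prompt=None, image_url=None, guidance_scale=None):
    """Assemble the input dict, leaving out optional fields that were not set."""
    payload = {"audio": audio_url}
    if prompt:
        payload["prompt"] = prompt
    if image_url:
        payload["image"] = image_url
    if guidance_scale is not None:
        payload["guidance_scale"] = guidance_scale
    return payload


def run_model(payload):
    """Send the payload to the model; requires network access and an API token."""
    # Imported lazily so build_input works without the client installed.
    import replicate

    return replicate.run(MODEL, input=payload)


# Same values as the example input listed above.
example_input = build_input(
    audio_url=(
        "https://replicate.delivery/pbxt/"
        "OUCNclpatd8lrBLaHbYZFRoOHKRgpkNyzL9MyjB3qdabxERb/"
        "Chatterbox%20Text%20to%20Speech.mp3"
    ),
    prompt=(
        "a woman speaks the words. her mouth moves up and down with the "
        "cadence of the words to make it look like it is speaking the words."
    ),
    guidance_scale=16.88,
)

# Only call the API when a token is available.
if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    print(run_model(example_input))
```

Omitting unset optional fields lets the model fall back to its own defaults instead of receiving explicit nulls.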
Input Parameters
audio (required) Type: string
Audio file to be used as the soundtrack for the video. Supported formats: wav, mp3, flac, ogg, m4a.
image Type: string
Input image to be used as the first frame of the video. Required if prompt is not provided.
prompt Type: string Default: (empty)
Text description of how the video should be generated. Required if image is not provided. If image is provided, this describes how the image should be animated.
guidance_scale Type: number Range: 1 - 50
Guidance scale (CFG) for video generation. Higher values make the output more closely follow the prompt but may reduce quality.
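The constraints listed above can be checked client-side before a prediction is submitted. The sketch below is illustrative only: `validate_input` is a hypothetical helper, the format check looks at the file extension (the model itself may inspect the file contents instead), and the 1 - 50 range is assumed to be inclusive.

```python
import os
from urllib.parse import unquote, urlparse

# Formats listed in the audio parameter description.
SUPPORTED_AUDIO = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def validate_input(payload):
    """Return a list of problems found in the payload (empty if none)."""
    errors = []

    audio = payload.get("audio")
    if not audio:
        errors.append("audio is required")
    else:
        # Assumption: the format can be inferred from the URL or path extension.
        ext = os.path.splitext(unquote(urlparse(audio).path))[1].lower()
        if ext not in SUPPORTED_AUDIO:
            errors.append(f"unsupported audio format: {ext or 'none'}")

    # At least one of image or prompt must be present.
    if not payload.get("image") and not payload.get("prompt"):
        errors.append("provide image, prompt, or both")

    # Assumption: the documented range 1 - 50 includes both endpoints.
    scale = payload.get("guidance_scale")
    if scale is not None and not (1 <= scale <= 50):
        errors.append("guidance_scale must be between 1 and 50")

    return errors
```

For example, `validate_input({"audio": "clip.txt"})` reports both the unsupported format and the missing image or prompt.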
Output Schema

Output

Type: string Format: uri

Example Execution Logs
Audio duration: 8.83s
Generating video from audio...
Generated audio-to-video in 35.0sec
Generated video: 1920x1080, 8.68s, 449,971,200 total pixels
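The last log line allows a rough consistency check. The page does not define "total pixels"; assuming it means width × height × frame count, the figures above work out to 217 frames, or about 25 frames per second over 8.68 s. A short sketch of that arithmetic (`frame_stats` is a hypothetical helper):

```python
import re

LOG_LINE = "Generated video: 1920x1080, 8.68s, 449,971,200 total pixels"


def frame_stats(line):
    """Derive (frame count, frames per second) from a 'Generated video' log line."""
    match = re.search(r"(\d+)x(\d+), ([\d.]+)s, ([\d,]+) total pixels", line)
    if match is None:
        raise ValueError("unrecognized log line")
    width, height = int(match.group(1)), int(match.group(2))
    duration = float(match.group(3))
    total_pixels = int(match.group(4).replace(",", ""))
    # Assumption: total pixels = width * height * number of frames.
    frames = total_pixels / (width * height)
    return frames, frames / duration


frames, fps = frame_stats(LOG_LINE)
print(f"{frames:.0f} frames, {fps:.2f} fps")
```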
Version Details
Version ID
208e8ab75e27c6927a276028436658e37683f6471da95a18facfcc539c92acf1
Version Created
January 27, 2026