lightricks/audio-to-video 🖼️📝🔢 → 🖼️
About
Generate videos from an audio input combined with an image or a text prompt.
Example Output
Prompt:
"a woman speaks the words. her mouth moves up and down with the cadence of the words to make it look like it is speaking the words."
Performance Metrics
- Prediction Time: 35.71s
- Total Time: 35.73s
All Input Parameters
{
"audio": "https://replicate.delivery/pbxt/OUCNclpatd8lrBLaHbYZFRoOHKRgpkNyzL9MyjB3qdabxERb/Chatterbox%20Text%20to%20Speech.mp3",
"prompt": "a woman speaks the words. her mouth moves up and down with the cadence of the words to make it look like it is speaking the words.",
"guidance_scale": 16.88
}
Input Parameters
- audio (required): Audio file to be used as the soundtrack for the video. Supported formats: wav, mp3, flac, ogg, m4a.
- image: Input image to be used as the first frame of the video. Required if prompt is not provided.
- prompt: Text description of how the video should be generated. Required if image is not provided. If image is provided, this describes how the image should be animated.
- guidance_scale: Guidance scale (CFG) for video generation. Higher values make the output follow the prompt more closely but may reduce quality.
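The constraints above (audio is required, and at least one of image or prompt must be present) can be checked before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_input` and `SUPPORTED_AUDIO_FORMATS` are hypothetical names, and checking the format by file extension is an assumption that only works when the audio URL or path ends in a file name.

```python
import os
from typing import Optional
from urllib.parse import unquote, urlparse

# Formats listed in the parameter description above.
SUPPORTED_AUDIO_FORMATS = {"wav", "mp3", "flac", "ogg", "m4a"}


def build_input(
    audio: str,
    image: Optional[str] = None,
    prompt: Optional[str] = None,
    guidance_scale: Optional[float] = None,
) -> dict:
    """Hypothetical helper: assemble an input payload and check the documented rules."""
    if not audio:
        raise ValueError("audio is required")

    # Assumption: the format can be read from the extension of the URL path.
    path = unquote(urlparse(audio).path)
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")

    # image is required if prompt is missing, and vice versa.
    if not image and not prompt:
        raise ValueError("provide at least one of image or prompt")

    payload = {"audio": audio}
    if image:
        payload["image"] = image
    if prompt:
        payload["prompt"] = prompt
    if guidance_scale is not None:
        payload["guidance_scale"] = guidance_scale
    return payload
```

The resulting dictionary has the same shape as the JSON shown under All Input Parameters, so it could be passed as the `input` argument of a client call such as `replicate.run("lightricks/audio-to-video", input=payload)`, assuming the Replicate Python client is installed and an API token is configured.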
Output Schema
Output
Example Execution Logs
Audio duration: 8.83s
Generating video from audio...
Generated audio-to-video in 35.0sec
Generated video: 1920x1080, 8.68s, 449,971,200 total pixels
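The figures in the final log entry can be cross-checked with a little arithmetic. The sketch below parses that entry and derives a frame count and frame rate. It assumes that "total pixels" means width × height × number of frames; the log itself does not say so, and `parse_video_log` is a hypothetical helper name.

```python
import re

LOG = "Generated video: 1920x1080, 8.68s, 449,971,200 total pixels"


def parse_video_log(line: str) -> dict:
    """Hypothetical helper: extract resolution, duration and pixel count from the log."""
    m = re.search(
        r"Generated video: (\d+)x(\d+), ([\d.]+)s, ([\d,]+) total pixels", line
    )
    if m is None:
        raise ValueError("log line not recognised")
    width, height = int(m.group(1)), int(m.group(2))
    duration = float(m.group(3))
    total_pixels = int(m.group(4).replace(",", ""))

    # Assumption: total pixels = width * height * frame count.
    frames, remainder = divmod(total_pixels, width * height)
    return {
        "width": width,
        "height": height,
        "duration_s": duration,
        "total_pixels": total_pixels,
        "frames": frames,
        "remainder": remainder,
        "fps": frames / duration,
    }


info = parse_video_log(LOG)
print(info["frames"], info["remainder"], round(info["fps"], 2))
```

Under that assumption the pixel count divides evenly into 217 frames of 1920x1080, which over 8.68 s corresponds to about 25 frames per second.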
Version Details
- Version ID: 208e8ab75e27c6927a276028436658e37683f6471da95a18facfcc539c92acf1
- Version Created: January 27, 2026