vidu/q3-pro
About
High-fidelity video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.
Example Output
Prompt:
"A rabbit warrior slowly raises his sharp sword, winks at the camera, and is captured with a Hitchcock zoom effect."
Output
Performance Metrics
Prediction Time: 238.21s
Total Time: 238.24s
All Input Parameters
{
"audio": true,
"prompt": "A rabbit warrior slowly raises his sharp sword, winks at the camera, and is captured with a Hitchcock zoom effect.",
"duration": 5,
"resolution": "720p",
"aspect_ratio": "16:9"
}
Input Parameters
- seed
- Random seed. Set for reproducible generation.
- audio
- Whether to generate audio synchronized with the video (dialogue and sound effects).
- prompt (required)
- Text prompt for video generation. Maximum 5000 characters.
- duration
- Duration of the video in seconds.
- end_image
- End frame image for the video. Must be used together with start_image for start-end-to-video mode. The aspect ratios of start and end images must be similar (ratio between 0.8 and 1.25). Supported formats: png, jpeg, jpg, webp.
- resolution
- Resolution of the output video.
- start_image
- Start frame image for the video. When provided without an end_image, the model runs in image-to-video mode. Supported formats: png, jpeg, jpg, webp.
- aspect_ratio
- Aspect ratio of the output video. Only used in text-to-video mode (ignored when images are provided).
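The parameter rules above imply three modes, chosen by which images are supplied. The following Python sketch shows one way a caller might check inputs before submitting a request. The helper names (`infer_mode`, `aspect_ratios_similar`, `build_input`) are hypothetical and not part of any published client. The reading of the 0.8 to 1.25 rule as the quotient of the two images' width/height ratios is an assumption, as is dropping `aspect_ratio` client-side when images are present.

```python
from typing import Any, Dict, Optional, Tuple

MAX_PROMPT_CHARS = 5000  # limit stated in the prompt parameter description


def infer_mode(start_image: Optional[str], end_image: Optional[str]) -> str:
    """Return the generation mode implied by which images are supplied."""
    if end_image is not None and start_image is None:
        raise ValueError("end_image must be used together with start_image")
    if start_image is not None and end_image is not None:
        return "start-end-to-video"
    if start_image is not None:
        return "image-to-video"
    return "text-to-video"


def aspect_ratios_similar(start_size: Tuple[int, int],
                          end_size: Tuple[int, int]) -> bool:
    """Assumed reading of the rule: (start w/h) / (end w/h) lies in [0.8, 1.25]."""
    start_ar = start_size[0] / start_size[1]
    end_ar = end_size[0] / end_size[1]
    return 0.8 <= start_ar / end_ar <= 1.25


def build_input(prompt: str,
                start_image: Optional[str] = None,
                end_image: Optional[str] = None,
                aspect_ratio: Optional[str] = None,
                **extra: Any) -> Dict[str, Any]:
    """Assemble an input dict, enforcing the documented constraints."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    mode = infer_mode(start_image, end_image)
    payload: Dict[str, Any] = {"prompt": prompt, **extra}
    if start_image is not None:
        payload["start_image"] = start_image
    if end_image is not None:
        payload["end_image"] = end_image
    # aspect_ratio is documented as ignored when images are provided,
    # so this sketch only sends it in text-to-video mode.
    if mode == "text-to-video" and aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload
```

For example, `build_input("A rabbit warrior...", duration=5, resolution="720p", aspect_ratio="16:9")` yields a text-to-video payload, while passing `start_image` as well switches the mode to image-to-video and leaves `aspect_ratio` out.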
Output Schema
Output
Example Execution Logs
Task created: 927358055639699456
Video generated in 235.5sec
Downloading 4020469 bytes
Downloaded 3.83MB in 2.25sec
Output video duration: 5.042s
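If these logs need to be processed programmatically, a small parser is enough. The sketch below assumes the log wording stays exactly as in the example above. The function name `parse_log` and the output keys are hypothetical.

```python
import re
from typing import Any, Dict

# Log text copied from the example execution above.
LOG = (
    "Task created: 927358055639699456 Video generated in 235.5sec "
    "Downloading 4020469 bytes Downloaded 3.83MB in 2.25sec "
    "Output video duration: 5.042s"
)

# Hypothetical field names mapped to (pattern, converter) pairs.
PATTERNS = {
    "task_id": (r"Task created: (\d+)", str),
    "generation_sec": (r"Video generated in ([\d.]+)sec", float),
    "download_bytes": (r"Downloading (\d+) bytes", int),
    "download_sec": (r"Downloaded [\d.]+MB in ([\d.]+)sec", float),
    "video_duration_sec": (r"Output video duration: ([\d.]+)s", float),
}


def parse_log(log: str) -> Dict[str, Any]:
    """Extract the fields that appear in the log; skip any that are absent."""
    fields: Dict[str, Any] = {}
    for key, (pattern, convert) in PATTERNS.items():
        match = re.search(pattern, log)
        if match:
            fields[key] = convert(match.group(1))
    return fields


fields = parse_log(LOG)
# The logged "3.83MB" matches bytes / 2**20, so the unit appears to be mebibytes.
size_mib = round(fields["download_bytes"] / 2**20, 2)
```

The parser uses `re.search` per field rather than splitting on newlines, so it works whether the log arrives as one line or several.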
Version Details
- Version ID
- 1a8e2767ffcf46d64a4d97727f187ccc2cadc59db844db7167327e01a2a79390
- Version Created
- April 12, 2026