bytedance/seedance-1.5-pro

⭐ Official ▶️ 98.6K runs 📅 Dec 2025 ⚙️ Cog 0.16.9
image-to-video-with-audio lipsync text-to-video-with-audio

About

A joint audio-video model that accurately follows complex instructions.

Example Output

Prompt:

"A young astronaut in a worn spacesuit sits in the dim cockpit of a spacecraft. The helmet visor is covered with fog and scratches, and the control panel flickers with orange-yellow lights, creating a tense and lonely atmosphere. The video begins with this static opening frame. The camera then rapidly zooms into the astronaut’s face before cutting to the exterior, revealing the spacecraft racing through a blizzard-like storm of cosmic debris. Sci-fi thriller style. Background music: low electronic synthesizers paired with rapidly swelling strings to build suspense. Sound effects: urgent engine hums and howling space-storm noise. Dialogue: "In the void of space, one wrong move..." followed by a brief silence, ending with: "Mayday... systems failing.""

Performance Metrics

60.24s Prediction Time
60.28s Total Time
All Input Parameters
{
  "fps": 24,
  "prompt": "A young astronaut in a worn spacesuit sits in the dim cockpit of a spacecraft. The helmet visor is covered with fog and scratches, and the control panel flickers with orange-yellow lights, creating a tense and lonely atmosphere. The video begins with this static opening frame. The camera then rapidly zooms into the astronaut’s face before cutting to the exterior, revealing the spacecraft racing through a blizzard-like storm of cosmic debris. Sci-fi thriller style. Background music: low electronic synthesizers paired with rapidly swelling strings to build suspense. Sound effects: urgent engine hums and howling space-storm noise. Dialogue: \"In the void of space, one wrong move...\" followed by a brief silence, ending with: \"Mayday... systems failing.\"",
  "duration": 5,
  "aspect_ratio": "16:9",
  "camera_fixed": false,
  "generate_audio": true
}
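
The parameters above could be submitted from Python roughly as follows. This is a minimal sketch, not an official snippet: it assumes the third-party `replicate` client is installed and that a `REPLICATE_API_TOKEN` environment variable is set. The prompt is shortened to its first sentence for readability, and `run_example` is a hypothetical helper name. Depending on the client version, the return value may be a plain URL string or a file-like wrapper around it.

```python
import json

MODEL = "bytedance/seedance-1.5-pro"

# Same values as the example above; the prompt is truncated to its first
# sentence here purely to keep the sketch short.
EXAMPLE_INPUT = {
    "fps": 24,
    "prompt": "A young astronaut in a worn spacesuit sits in the dim cockpit of a spacecraft.",
    "duration": 5,
    "aspect_ratio": "16:9",
    "camera_fixed": False,
    "generate_audio": True,
}


def run_example(model_input=EXAMPLE_INPUT):
    """Submit the input to the model (hypothetical helper).

    Assumes the `replicate` package is installed and REPLICATE_API_TOKEN is set.
    The import is deferred so the payload can be inspected without the package.
    """
    import replicate

    return replicate.run(MODEL, input=model_input)


if __name__ == "__main__":
    # Print the payload only; call run_example() yourself to start a prediction.
    print(json.dumps(EXAMPLE_INPUT, indent=2))
```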
Input Parameters
fps Default: 24
Frame rate (frames per second)
seed Type: integer
Random seed. Set for reproducible generation
image Type: string
Input image for image-to-video generation
prompt (required) Type: string
Text prompt for video generation
duration Type: integer, Default: 5, Range: 2 - 12
Video duration in seconds
aspect_ratio Default: 16:9
Video aspect ratio. Ignored if an image is used.
camera_fixed Type: boolean, Default: false
Whether to fix camera position
generate_audio Type: boolean, Default: false
Generate audio synchronized with the video. When enabled, the model outputs a video with audio that matches the visuals.
last_frame_image Type: string
Input image for last frame generation. This only works if an image start frame is given too.
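
The constraints listed above (required prompt, duration range, and the dependency of `last_frame_image` on `image`) can be checked client-side before a request is sent. The following is a sketch under stated assumptions: `build_input` is a hypothetical helper, not part of any Replicate library, and the decisions to raise an error for a lone `last_frame_image` and to omit `aspect_ratio` when an image is supplied are illustrative choices based on the parameter descriptions. The model itself may handle these cases differently.

```python
def build_input(prompt, *, image=None, last_frame_image=None, duration=5,
                fps=24, aspect_ratio="16:9", camera_fixed=False,
                generate_audio=False, seed=None):
    """Build an input dict using the documented defaults (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")

    # duration is documented as an integer in the range 2 - 12.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if not 2 <= duration <= 12:
        raise ValueError("duration must be between 2 and 12 seconds")

    # last_frame_image only works when a start-frame image is given too.
    if last_frame_image is not None and image is None:
        raise ValueError("last_frame_image requires image to be set as well")

    payload = {
        "prompt": prompt,
        "duration": duration,
        "fps": fps,
        "camera_fixed": camera_fixed,
        "generate_audio": generate_audio,
    }
    if image is None:
        # aspect_ratio is ignored when an image is used, so send it only here.
        payload["aspect_ratio"] = aspect_ratio
    else:
        payload["image"] = image
        if last_frame_image is not None:
            payload["last_frame_image"] = last_frame_image
    if seed is not None:
        payload["seed"] = seed  # set for reproducible generation
    return payload
```

For example, `build_input("a cat walking", duration=8, seed=42)` returns a dict with `aspect_ratio` set to the default `16:9`, while passing `image=...` drops `aspect_ratio` from the payload.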
Output Schema

Output

Type: string, Format: uri
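
Since the output is a single URI string, saving the result locally is a matter of fetching that URI. The sketch below uses only the Python standard library. The helper names are hypothetical, and the `output.mp4` fallback filename is an assumption, since the schema does not state the container format.

```python
import shutil
from pathlib import PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def is_http_uri(value):
    """Return True if value looks like an absolute http(s) URI."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def local_name(uri, default="output.mp4"):
    """Derive a local filename from the URI path; the default is an assumption."""
    name = PurePosixPath(urlparse(uri).path).name
    return name or default


def save_output(uri, dest=None):
    """Stream the file at uri to disk and return the destination path."""
    if not is_http_uri(uri):
        raise ValueError(f"not an http(s) URI: {uri!r}")
    dest = dest or local_name(uri)
    with urlopen(uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```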

Example Execution Logs
Using seed: 1040530827
Generating video...
Generated video in 55.0sec
Downloading 6540671 bytes
Downloaded 6.24MB in 4.77sec
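
The figures in these logs can be cross-checked with a little arithmetic. The sketch below parses the log lines shown above (the regular expressions assume exactly this wording) and shows that the logged "6.24MB" corresponds to 6540671 bytes divided by 2^20, i.e. binary megabytes. It also sums the generation and download steps, which come to 59.77 s, slightly under the 60.24 s prediction time reported above; the logs do not say what accounts for the remainder. `summarize` is a hypothetical helper name.

```python
import re

LOGS = """\
Using seed: 1040530827
Generating video...
Generated video in 55.0sec
Downloading 6540671 bytes
Downloaded 6.24MB in 4.77sec
"""


def summarize(logs):
    """Extract seed, timings, and file size from log text in the format above."""
    seed = int(re.search(r"Using seed: (\d+)", logs).group(1))
    generation_s = float(re.search(r"Generated video in ([\d.]+)sec", logs).group(1))
    size_bytes = int(re.search(r"Downloading (\d+) bytes", logs).group(1))
    download_s = float(re.search(r"Downloaded [\d.]+MB in ([\d.]+)sec", logs).group(1))
    size_mib = size_bytes / 2**20  # binary megabytes, matching the logged value
    return {
        "seed": seed,
        "generation_s": generation_s,
        "download_s": download_s,
        "size_mib": size_mib,
        "logged_steps_s": generation_s + download_s,
        "download_mib_per_s": size_mib / download_s,
    }


if __name__ == "__main__":
    for key, value in summarize(LOGS).items():
        print(f"{key}: {value}")
```

Reusing the logged seed in the `seed` input parameter is how a run like this one would be made reproducible, per the parameter description.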
Version Details
Version ID
78c1986fecf4df185593bbf148a0ed5b4b18b7e0e0f34cc3e32c68cdfa9536ba
Version Created
December 23, 2025