lucataco/ltx-video-0.9.8-distilled

82.9K runs • Jul 2025 • Cog 0.15.11

About

Generates native long-form video with controllable conditioning on text, image, or video inputs.

Example Output

Prompt:

"The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim"

Output

Performance Metrics

13.61s Prediction Time

95.24s Total Time

All Input Parameters

{
  "fps": 24,
  "prompt": "The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim",
  "go_fast": true,
  "num_frames": 121,
  "resolution": 480,
  "aspect_ratio": "match_input_image",
  "guidance_scale": 3,
  "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
  "denoise_strength": 0.4,
  "downscale_factor": 0.667,
  "conditioning_frames": 21,
  "num_inference_steps": 24,
  "max_duration_seconds": 5,
  "final_inference_steps": 10
}
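
The parameters above could be sent to the model from Python roughly as follows. This is a minimal sketch, not an official client recipe: it assumes the third-party `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set, and the helper names `build_input` and `run_prediction` are hypothetical. The defaults are copied from the Input Parameters list on this page, and the version ID from Version Details.

```python
MODEL = "lucataco/ltx-video-0.9.8-distilled"
VERSION = "6757cbcee0253dca9e6c4df0e026c009b58673bbaaf1d88d3f4058cfc692fba5"

# Defaults as documented in the Input Parameters list on this page.
DEFAULTS = {
    "fps": 24,
    "go_fast": True,
    "num_frames": 121,
    "resolution": 720,
    "aspect_ratio": "match_input_image",
    "guidance_scale": 3,
    "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
    "denoise_strength": 0.4,
    "downscale_factor": 0.667,
    "conditioning_frames": 21,
    "num_inference_steps": 24,
    "max_duration_seconds": 5,
    "final_inference_steps": 10,
}

# Documented parameters that have no default value.
OPTIONAL_NO_DEFAULT = {"seed", "image", "video"}


def build_input(prompt, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_NO_DEFAULT
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides, "prompt": prompt}


def run_prediction(model_input):
    """Submit the input to Replicate (needs network access and an API token)."""
    import replicate  # third-party client, imported lazily

    return replicate.run(f"{MODEL}:{VERSION}", input=model_input)
```

For the example on this page, `build_input(prompt, resolution=480)` reproduces the input shown above, since every other value matches the documented default.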

Input Parameters

fps (integer, default: 24, range: 1-60): Frames per second for the output video.
seed (integer): Random seed for reproducible results. Leave blank for a random seed.
image (string): Input image for image-to-video generation. If provided, the video is generated from this image.
video (string): Input video for video-to-video generation. If provided, the video is generated from this video. Takes precedence over image if both are provided.
prompt (string, required): Text prompt for video generation.
go_fast (boolean, default: true): Enable fast mode with skip-layer strategies (20-40% faster, at slightly lower quality).
num_frames (integer, default: 121, range: 9-257): Number of frames per segment. Use 257 for ~10.7s segments at 24fps (recommended for long videos).
resolution (default: 720): Resolution of the output video (height in pixels).
aspect_ratio (default: match_input_image): Aspect ratio of the output video.
guidance_scale (number, default: 3, range: 1-10): Guidance scale. Recommended range: 3.0-3.5.
negative_prompt (string, default: "worst quality, inconsistent motion, blurry, jittery, distorted"): Negative prompt for video generation.
denoise_strength (number, default: 0.4, range: 0-1): Denoising strength for the final refinement step.
downscale_factor (number, default: 0.667, range: 0.1-1): Factor by which to downscale the initial generation (recommended: 2/3 for better quality).
conditioning_frames (integer, default: 21, range: 1-50): Number of frames to use for video-to-video conditioning (only used when a video input is provided).
num_inference_steps (integer, default: 24, range: 2-50): Number of denoising steps for the initial generation.
max_duration_seconds (integer, default: 5, range: 5-60): Target video duration in seconds. Videos longer than 10s use autoregressive generation.
final_inference_steps (integer, default: 10, range: 1-50): Number of inference steps for the final denoising.
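
The documented ranges can be checked locally before a request is submitted, and the "~10.7s at 24fps" figure for 257 frames follows from dividing frames by fps. The sketch below illustrates both; the names `RANGES`, `check_ranges`, and `segment_seconds` are hypothetical, and the check is a client-side convenience rather than the model's own validation.

```python
# Inclusive (min, max) ranges as listed in the parameter reference above.
RANGES = {
    "fps": (1, 60),
    "num_frames": (9, 257),
    "guidance_scale": (1, 10),
    "denoise_strength": (0, 1),
    "downscale_factor": (0.1, 1),
    "conditioning_frames": (1, 50),
    "num_inference_steps": (2, 50),
    "max_duration_seconds": (5, 60),
    "final_inference_steps": (1, 50),
}


def check_ranges(params):
    """Return a list of messages for values outside their documented range."""
    problems = []
    for name, value in params.items():
        if name not in RANGES:
            continue  # no documented range for this parameter
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    return problems


def segment_seconds(num_frames, fps):
    """Length of one generated segment in seconds."""
    return num_frames / fps


if __name__ == "__main__":
    print(check_ranges({"fps": 24, "num_frames": 300}))
    print(f"{segment_seconds(257, 24):.1f}s per 257-frame segment at 24fps")
```

By the same division, the default 121 frames at 24fps give a segment of roughly 5 seconds, which lines up with the default max_duration_seconds of 5.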

Output Schema

Output

Type: string • Format: uri
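
Because the output is a single URI string, saving the result is a plain download. A minimal sketch using only the standard library, assuming the URI is directly fetchable over HTTP(S); the helper names `filename_from_uri` and `save_output` are hypothetical, and some client libraries may wrap the output in their own file object instead of returning a bare string.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri, default="output.mp4"):
    """Use the last path component of the URI as the local file name."""
    name = os.path.basename(urlparse(uri).path)
    return name or default


def save_output(uri, directory="."):
    """Download the video at `uri` into `directory` and return the local path."""
    path = os.path.join(directory, filename_from_uri(uri))
    urllib.request.urlretrieve(uri, path)  # requires network access
    return path
```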

Example Execution Logs

Using seed: 41686
Warning: match_input_image selected but no image provided. Using 16:9 aspect ratio.
Rounded dimensions for optimal processing: 853x480 -> 864x480
Using resolution: 864x480 (16:9)
Step 1: Text-to-Video generation at downscaled resolution: 576x320
Using fast mode with optimized parameters
  0%|          | 0/12 [00:00<?, ?it/s]
  8%|▊         | 1/12 [00:00<00:04,  2.26it/s]
 17%|█▋        | 2/12 [00:00<00:03,  2.73it/s]
 25%|██▌       | 3/12 [00:01<00:03,  2.60it/s]
 33%|███▎      | 4/12 [00:01<00:03,  2.54it/s]
 42%|████▏     | 5/12 [00:01<00:02,  2.52it/s]
 50%|█████     | 6/12 [00:02<00:02,  2.50it/s]
 58%|█████▊    | 7/12 [00:02<00:02,  2.49it/s]
 67%|██████▋   | 8/12 [00:03<00:01,  2.47it/s]
 75%|███████▌  | 9/12 [00:03<00:01,  2.47it/s]
 83%|████████▎ | 10/12 [00:04<00:00,  2.47it/s]
 92%|█████████▏| 11/12 [00:04<00:00,  2.46it/s]
100%|██████████| 12/12 [00:04<00:00,  2.46it/s]
100%|██████████| 12/12 [00:04<00:00,  2.49it/s]
Step 2: Upsampling to: 1152x640
Step 3: Final denoising with 10 steps
  0%|          | 0/2 [00:00<?, ?it/s]
 50%|█████     | 1/2 [00:01<00:01,  1.97s/it]
100%|██████████| 2/2 [00:03<00:00,  1.73s/it]
100%|██████████| 2/2 [00:03<00:00,  1.76s/it]
Step 4: Final resize to: 853x480
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (853, 480) to (864, 480) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[+] Text-to-Video generation complete: /tmp/output.mp4
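
The resolutions in this log can be reproduced with simple arithmetic. The sketch below is a reconstruction that happens to match the logged numbers, not the model's actual code: it assumes dimensions are snapped to the nearest multiple of 32 and that the upsampling step doubles each dimension. The function names are hypothetical.

```python
def nearest_multiple(value, base=32):
    """Snap a dimension to the nearest multiple of `base` (assumed to be 32)."""
    return max(base, round(value / base) * base)


def plan_resolutions(height, aspect_w=16, aspect_h=9,
                     downscale_factor=0.667, upsample=2):
    """Reproduce the resolution of each stage reported in the log."""
    # Requested output size from the height and aspect ratio.
    target = (round(height * aspect_w / aspect_h), height)
    # "Rounded dimensions for optimal processing".
    working = (nearest_multiple(target[0]), nearest_multiple(target[1]))
    # Step 1: initial generation at the downscaled resolution.
    first_pass = (nearest_multiple(working[0] * downscale_factor),
                  nearest_multiple(working[1] * downscale_factor))
    # Step 2: upsampling, assumed here to be a fixed 2x.
    upsampled = (first_pass[0] * upsample, first_pass[1] * upsample)
    # Step 4 resizes back to `target`.
    return {"target": target, "working": working,
            "first_pass": first_pass, "upsampled": upsampled}


if __name__ == "__main__":
    for stage, (w, h) in plan_resolutions(480).items():
        print(f"{stage}: {w}x{h}")
```

For a height of 480 at 16:9 this gives 853x480, 864x480, 576x320, and 1152x640, matching the log. The final ffmpeg warning is a separate adjustment: the writer pads 853 up to 864, the next multiple of its macro block size of 16.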

Version Details

Version ID: 6757cbcee0253dca9e6c4df0e026c009b58673bbaaf1d88d3f4058cfc692fba5
Version Created: July 24, 2025
