lucataco/ltx-video-0.9.8-distilled
About
Generate native long-form video, with controllability
Example Output
Prompt:
"The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim"
Output
Performance Metrics
- Prediction Time: 13.61s
- Total Time: 95.24s
All Input Parameters
{
  "fps": 24,
  "prompt": "The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim",
  "go_fast": true,
  "num_frames": 121,
  "resolution": 480,
  "aspect_ratio": "match_input_image",
  "guidance_scale": 3,
  "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
  "denoise_strength": 0.4,
  "downscale_factor": 0.667,
  "conditioning_frames": 21,
  "num_inference_steps": 24,
  "max_duration_seconds": 5,
  "final_inference_steps": 10
}
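The parameters above can be assembled in code before submitting a prediction. The following is a minimal sketch, not the author's client code: it assumes the model is hosted on Replicate and called through the `replicate` Python package (with `REPLICATE_API_TOKEN` set in the environment). The helper names `build_input` and `run_prediction` are hypothetical; the model name, version ID, parameter names, and default values are copied from this page.

```python
MODEL = "lucataco/ltx-video-0.9.8-distilled"
VERSION = "6757cbcee0253dca9e6c4df0e026c009b58673bbaaf1d88d3f4058cfc692fba5"

# Parameter names listed on this page; anything else is rejected early.
ALLOWED_KEYS = {
    "fps", "seed", "image", "video", "prompt", "go_fast", "num_frames",
    "resolution", "aspect_ratio", "guidance_scale", "negative_prompt",
    "denoise_strength", "downscale_factor", "conditioning_frames",
    "num_inference_steps", "max_duration_seconds", "final_inference_steps",
}


def build_input(prompt: str, **overrides) -> dict:
    """Return the example parameters from this page, with optional overrides."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    params = {
        "fps": 24,
        "prompt": prompt,
        "go_fast": True,
        "num_frames": 121,
        "resolution": 480,
        "aspect_ratio": "match_input_image",
        "guidance_scale": 3,
        "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
        "denoise_strength": 0.4,
        "downscale_factor": 0.667,
        "conditioning_frames": 21,
        "num_inference_steps": 24,
        "max_duration_seconds": 5,
        "final_inference_steps": 10,
    }
    params.update(overrides)
    return params


def run_prediction(params: dict):
    """Submit the prediction. Assumption: Replicate hosting and its Python client."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(f"{MODEL}:{VERSION}", input=params)
```

Calling `run_prediction(build_input("Turquoise waves crash against dark rocks"))` would submit a text-to-video request with the same settings as the example; the shape of the returned value depends on the client version.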
Input Parameters
- fps
- Frames per second for the output video.
- seed
- Random seed for reproducible results. Leave blank for a random seed.
- image
- Input image for image-to-video generation. If provided, the video is generated from this image.
- video
- Input video for video-to-video generation. If provided, the video is generated from this video. Takes precedence over image if both are provided.
- prompt (required)
- Text prompt for video generation.
- go_fast
- Enable fast mode with skip-layer strategies (20-40% faster, at slightly lower quality).
- num_frames
- Number of frames per segment. Use 257 for ~10.7s segments at 24fps (recommended for long videos).
- resolution
- Resolution for the output video (height in pixels).
- aspect_ratio
- Aspect ratio for the output video.
- guidance_scale
- Guidance scale. Recommended range: 3.0-3.5.
- negative_prompt
- Negative prompt for video generation.
- denoise_strength
- Denoising strength for final refinement step.
- downscale_factor
- Factor to downscale initial generation (recommended: 2/3 for better quality).
- conditioning_frames
- Number of frames to use for video-to-video conditioning (only used when video input is provided).
- num_inference_steps
- Number of denoising steps for initial generation.
- max_duration_seconds
- Target video duration in seconds. Videos longer than 10s use autoregressive generation.
- final_inference_steps
- Number of inference steps for final denoising.
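The relationship between num_frames, fps, and max_duration_seconds described above comes down to simple arithmetic. The sketch below works through it under stated assumptions: the 10-second threshold and the precedence of video over image come from the parameter descriptions, while the segment count is a naive estimate that ignores any overlap between segments (how the model stitches segments is not documented here). The function names are hypothetical.

```python
import math


def segment_seconds(num_frames: int, fps: int) -> float:
    """Length of one segment in seconds."""
    return num_frames / fps


def plan_segments(max_duration_seconds: float, num_frames: int, fps: int) -> dict:
    """Naive segment estimate; assumes no overlap between segments."""
    total_frames = math.ceil(max_duration_seconds * fps)
    return {
        "total_frames": total_frames,
        "segments": max(1, math.ceil(total_frames / num_frames)),
        # Per the parameter description, videos longer than 10s
        # use autoregressive generation.
        "autoregressive": max_duration_seconds > 10,
    }


def generation_mode(image=None, video=None) -> str:
    """Video takes precedence over image; with neither, fall back to text."""
    if video is not None:
        return "video-to-video"
    if image is not None:
        return "image-to-video"
    return "text-to-video"


print(round(segment_seconds(257, 24), 1))  # 10.7, matching the num_frames note
print(plan_segments(5, 121, 24))           # the example run: a single segment
print(plan_segments(30, 257, 24))          # a longer request: several segments
```

With the example inputs (121 frames at 24 fps, 5-second target), one segment of about 5 seconds covers the whole video, so no autoregressive extension is needed.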
Output Schema
Output
Example Execution Logs
Using seed: 41686
Warning: match_input_image selected but no image provided. Using 16:9 aspect ratio.
Rounded dimensions for optimal processing: 853x480 -> 864x480
Using resolution: 864x480 (16:9)
Step 1: Text-to-Video generation at downscaled resolution: 576x320
Using fast mode with optimized parameters
100%|██████████| 12/12 [00:04<00:00, 2.49it/s]
Step 2: Upsampling to: 1152x640
Step 3: Final denoising with 10 steps
100%|██████████| 2/2 [00:03<00:00, 1.76s/it]
Step 4: Final resize to: 853x480
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (853, 480) to (864, 480) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[+] Text-to-Video generation complete: /tmp/output.mp4
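The resolutions in this log can be reproduced with a few lines of arithmetic. This is a sketch of one rounding scheme that happens to match the logged values, not the model's actual code: rounding to multiples of 32 and a fixed 2x upsampling step are both assumptions inferred from the numbers (864x480, 576x320, 1152x640), and the function names are hypothetical.

```python
import math

MULTIPLE = 32  # assumption: dimensions are aligned to multiples of 32


def round_up(value: float, multiple: int = MULTIPLE) -> int:
    """Round up to the next multiple."""
    return math.ceil(value / multiple) * multiple


def round_nearest(value: float, multiple: int = MULTIPLE) -> int:
    """Round to the nearest multiple, never below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def plan_resolution(height: int, aspect_w: int, aspect_h: int,
                    downscale_factor: float) -> dict:
    requested_width = int(height * aspect_w / aspect_h)
    rounded = (round_up(requested_width), round_up(height))
    downscaled = (round_nearest(rounded[0] * downscale_factor),
                  round_nearest(rounded[1] * downscale_factor))
    upsampled = (downscaled[0] * 2, downscaled[1] * 2)  # assumption: 2x upsampler
    return {
        "requested": (requested_width, height),  # Step 4 resize target
        "rounded": rounded,                      # "Using resolution"
        "downscaled": downscaled,                # Step 1
        "upsampled": upsampled,                  # Step 2
    }


print(plan_resolution(480, 16, 9, 0.667))
```

For the example run (height 480, 16:9, downscale_factor 0.667) this yields 853x480 requested, 864x480 rounded, 576x320 downscaled, and 1152x640 upsampled, the same figures the log reports. Other inputs may not follow the same rule.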
Version Details
- Version ID
- 6757cbcee0253dca9e6c4df0e026c009b58673bbaaf1d88d3f4058cfc692fba5
- Version Created
- July 24, 2025