lightricks/ltx-video-0.9.7 🔢🖼️📝 → 🖼️

▶️ 2.4K runs 📅 May 2025 ⚙️ Cog 0.14.9
image-to-video, text-to-video

About

A DiT-based, 13B-parameter video generation model that creates 30 fps video.

Example Output

Prompt:

"A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show."

Output

Performance Metrics

46.97s Prediction Time
136.11s Total Time
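
The gap between the two figures is time spent outside the prediction itself. A small sketch of the arithmetic, using the numbers reported here and the frame count from this run's inputs (161). The variable names are illustrative, and what the platform counts toward total time (for example queueing or model startup) is an assumption, not something this page states.

```python
# Figures copied from the metrics on this page; names are illustrative.
prediction_s = 46.97   # time spent generating the video
total_s = 136.11       # wall-clock time for the whole request
num_frames = 161       # frame count used for this run

# Time not spent in the prediction (assumed: queueing, startup, upload).
overhead_s = round(total_s - prediction_s, 2)

# Generated frames per second of prediction time.
frames_per_s = round(num_frames / prediction_s, 2)

print(f"overhead: {overhead_s} s, throughput: {frames_per_s} frames/s")
```

For this run, roughly two thirds of the total time falls outside the prediction.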
All Input Parameters
{
  "fps": 24,
  "width": 704,
  "height": 480,
  "prompt": "A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.",
  "num_frames": 161,
  "guidance_scale": 3,
  "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
  "num_inference_steps": 50
}
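
One way to submit a payload like this is through the Replicate Python client. The following is a minimal sketch, not the canonical invocation: the model reference string is assembled from the page title and the version ID listed under Version Details, and `build_input` and `run_prediction` are hypothetical helper names. The call needs the third-party `replicate` package, network access, and a `REPLICATE_API_TOKEN` environment variable, so it is skipped when no token is set.

```python
import os

# Model reference assembled from the page title and the version ID listed
# under Version Details; treat the exact string as an assumption.
MODEL_REF = (
    "lightricks/ltx-video-0.9.7:"
    "b1a80c6dbce390c23bb52aecebc0e09d445ac12136dd4dc539350c76030fc815"
)

def build_input(prompt, **overrides):
    """Return an input payload that starts from the values shown above."""
    payload = {
        "fps": 24,
        "width": 704,
        "height": 480,
        "prompt": prompt,
        "num_frames": 161,
        "guidance_scale": 3,
        "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
        "num_inference_steps": 50,
    }
    payload.update(overrides)  # e.g. seed=1234 or image="https://..."
    return payload

def run_prediction(payload):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # third-party client, imported lazily
    return replicate.run(MODEL_REF, input=payload)

if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    print(run_prediction(build_input("A close-up of a woman speaking, dim blue lighting")))
```

Passing `image=...` as an override would select image-to-video, since text-to-video is used only when no image is provided.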
Input Parameters
fps Type: integer; Default: 24
Frames per second for the output video.
seed Type: integer
Random seed for reproducible results. Leave blank for a random seed.
image Type: string
Input image for image-to-video generation. If not provided, text-to-video generation will be used.
width Type: integer; Default: 704
Width of the output video. Actual width will be a multiple of 32.
height Type: integer; Default: 480
Height of the output video. Actual height will be a multiple of 32.
prompt (required) Type: string
Text prompt for video generation.
num_frames Type: integer; Default: 161
Number of frames to generate. Actual frame count will be 8N+1 (e.g., 9, 17, 25, 161).
guidance_scale Type: number; Default: 3; Range: 1 - 10
Guidance scale. Recommended range: 3.0-3.5.
negative_prompt Type: string; Default: worst quality, inconsistent motion, blurry, jittery, distorted
Negative prompt for video generation.
num_inference_steps Type: integer; Default: 50
Number of denoising steps.
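
The size and frame-count constraints above can be checked before submitting a request. This is a rough sketch under stated assumptions: the page says only that width and height end up as multiples of 32 and the frame count as 8N+1, so rounding down, the minimum of one multiple, and N >= 1 are guesses, and all function names are hypothetical. The model's own preprocessing may round differently.

```python
def snap_down(value, multiple):
    """Round value down to a multiple, never below one multiple (assumed)."""
    return max(multiple, (value // multiple) * multiple)

def snap_frames(num_frames):
    """Round down to the form 8N+1 with N >= 1 (assumed minimum of 9)."""
    n = max(1, (num_frames - 1) // 8)
    return 8 * n + 1

def normalize(width, height, num_frames, guidance_scale=3.0):
    """Return inputs adjusted to the documented constraints."""
    if not 1 <= guidance_scale <= 10:
        raise ValueError("guidance_scale must be between 1 and 10")
    return {
        "width": snap_down(width, 32),
        "height": snap_down(height, 32),
        "num_frames": snap_frames(num_frames),
        "guidance_scale": guidance_scale,
    }

print(normalize(704, 480, 161))  # defaults already satisfy the constraints
print(normalize(720, 500, 100))  # adjusted to 704 x 480 and 97 frames
```

At the default 24 fps, 161 frames correspond to about 6.7 seconds of video.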
Output Schema

Output

Type: string; Format: uri
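
Because the output is a URI and not the video bytes, the file has to be fetched separately. A minimal sketch using only the Python standard library, assuming you already hold the URI as a plain string; `save_output` is a hypothetical helper name, and the destination file name is arbitrary.

```python
import shutil
import urllib.request
from pathlib import Path

def save_output(uri, dest):
    """Stream the file at uri to dest and return the destination path."""
    dest = Path(dest)
    with urllib.request.urlopen(uri) as response, dest.open("wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all at once
    return dest
```

A call would look like `save_output(output_uri, "output.mp4")`, where `output_uri` stands for the string returned by the prediction.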

Example Execution Logs
Using seed: 10311
Original inputs: width=704, height=480, num_frames=161
Processed inputs: width=704, height=480, num_frames=161
[~] Using Text-to-Video pipeline
[~] Generating video with prompt: 'A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.'
  0%|          | 0/50 [00:00<?, ?it/s]
  2%|▏         | 1/50 [00:00<00:36,  1.34it/s]
  4%|▍         | 2/50 [00:01<00:26,  1.80it/s]
  6%|▌         | 3/50 [00:01<00:29,  1.59it/s]
  8%|▊         | 4/50 [00:02<00:30,  1.51it/s]
 10%|█         | 5/50 [00:03<00:30,  1.47it/s]
 12%|█▏        | 6/50 [00:04<00:30,  1.45it/s]
 14%|█▍        | 7/50 [00:04<00:30,  1.43it/s]
 16%|█▌        | 8/50 [00:05<00:29,  1.42it/s]
 18%|█▊        | 9/50 [00:06<00:29,  1.41it/s]
 20%|██        | 10/50 [00:06<00:28,  1.41it/s]
 22%|██▏       | 11/50 [00:07<00:27,  1.41it/s]
 24%|██▍       | 12/50 [00:08<00:27,  1.40it/s]
 26%|██▌       | 13/50 [00:09<00:26,  1.40it/s]
 28%|██▊       | 14/50 [00:09<00:25,  1.40it/s]
 30%|███       | 15/50 [00:10<00:25,  1.40it/s]
 32%|███▏      | 16/50 [00:11<00:24,  1.40it/s]
 34%|███▍      | 17/50 [00:11<00:23,  1.40it/s]
 36%|███▌      | 18/50 [00:12<00:22,  1.40it/s]
 38%|███▊      | 19/50 [00:13<00:22,  1.39it/s]
 40%|████      | 20/50 [00:14<00:21,  1.39it/s]
 42%|████▏     | 21/50 [00:14<00:20,  1.39it/s]
 44%|████▍     | 22/50 [00:15<00:20,  1.39it/s]
 46%|████▌     | 23/50 [00:16<00:19,  1.39it/s]
 48%|████▊     | 24/50 [00:16<00:18,  1.39it/s]
 50%|█████     | 25/50 [00:17<00:17,  1.39it/s]
 52%|█████▏    | 26/50 [00:18<00:17,  1.39it/s]
 54%|█████▍    | 27/50 [00:19<00:16,  1.39it/s]
 56%|█████▌    | 28/50 [00:19<00:15,  1.39it/s]
 58%|█████▊    | 29/50 [00:20<00:15,  1.39it/s]
 60%|██████    | 30/50 [00:21<00:14,  1.39it/s]
 62%|██████▏   | 31/50 [00:21<00:13,  1.39it/s]
 64%|██████▍   | 32/50 [00:22<00:12,  1.39it/s]
 66%|██████▌   | 33/50 [00:23<00:12,  1.39it/s]
 68%|██████▊   | 34/50 [00:24<00:11,  1.39it/s]
 70%|███████   | 35/50 [00:24<00:10,  1.39it/s]
 72%|███████▏  | 36/50 [00:25<00:10,  1.39it/s]
 74%|███████▍  | 37/50 [00:26<00:09,  1.39it/s]
 76%|███████▌  | 38/50 [00:27<00:08,  1.39it/s]
 78%|███████▊  | 39/50 [00:27<00:07,  1.39it/s]
 80%|████████  | 40/50 [00:28<00:07,  1.38it/s]
 82%|████████▏ | 41/50 [00:29<00:06,  1.38it/s]
 84%|████████▍ | 42/50 [00:29<00:05,  1.38it/s]
 86%|████████▌ | 43/50 [00:30<00:05,  1.38it/s]
 88%|████████▊ | 44/50 [00:31<00:04,  1.38it/s]
 90%|█████████ | 45/50 [00:32<00:03,  1.38it/s]
 92%|█████████▏| 46/50 [00:32<00:02,  1.38it/s]
 94%|█████████▍| 47/50 [00:33<00:02,  1.38it/s]
 96%|█████████▌| 48/50 [00:34<00:01,  1.38it/s]
 98%|█████████▊| 49/50 [00:34<00:00,  1.38it/s]
100%|██████████| 50/50 [00:35<00:00,  1.38it/s]
100%|██████████| 50/50 [00:35<00:00,  1.40it/s]
[+] Video generation complete: output.mp4
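
The progress lines above carry the step count, elapsed time, and iteration rate, which can be pulled out for monitoring. A minimal sketch that assumes the line layout shown in this log (elapsed time as `MM:SS`); the pattern and the `parse_progress` name are illustrative, and lines without a numeric rate, or with an hours field, simply return `None`.

```python
import re

# Matches the tail of a progress line such as "50/50 [00:35<00:00,  1.40it/s]".
PROGRESS = re.compile(r"(\d+)/(\d+) \[(\d+):(\d+)<[^,]*,\s*([\d.]+)it/s\]")

def parse_progress(line):
    """Return (step, total, elapsed_seconds, rate) or None if no rate is shown."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    step, total, minutes, seconds, rate = match.groups()
    return int(step), int(total), int(minutes) * 60 + int(seconds), float(rate)

last = "100%|██████████| 50/50 [00:35<00:00,  1.40it/s]"
step, total, elapsed, rate = parse_progress(last)
print(f"{step}/{total} steps in {elapsed} s at {rate} it/s")
```

In this run the 50 denoising steps account for about 35 s of the 46.97 s prediction time.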
Version Details
Version ID
b1a80c6dbce390c23bb52aecebc0e09d445ac12136dd4dc539350c76030fc815
Version Created
May 13, 2025