chenxwh/ltx-video 🔢🖼️📝 → 🖼️

▶️ 3.4K runs 📅 Nov 2024 ⚙️ Cog 0.9.23
Tags: image-to-video, real-time, text-to-video

About

A DiT-based model that generates high-quality videos in real time.

Example Output

Prompt:

"The waves crash against the jagged rocks of the shoreline. The waves crash against the jagged rocks of the shoreline, sending spray high into the air.The rocks are a dark gray color, with sharp edges and deep crevices. The water is a clear blue-green, with white foam where the waves break against the rocks. The sky is a light gray, with a few white clouds dotting the horizon."

Performance Metrics

Prediction time: 31.15 s
Total time: 516.30 s
All Input Parameters
{
  "width": 704,
  "height": 480,
  "prompt": "The waves crash against the jagged rocks of the shoreline. The waves crash against the jagged rocks of the shoreline, sending spray high into the air.The rocks are a dark gray color, with sharp edges and deep crevices. The water is a clear blue-green, with white foam where the waves break against the rocks. The sky is a light gray, with a few white clouds dotting the horizon.",
  "frame_rate": 25,
  "num_frames": 121,
  "guidance_scale": 3,
  "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
  "num_inference_steps": 40
}
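The same run could be reproduced from Python with the official `replicate` client. This is a minimal sketch, assuming the package is installed (`pip install replicate`) and `REPLICATE_API_TOKEN` is set in the environment. The model reference combines the model name with the version ID listed under Version Details; `build_input` and `run_ltx_video` are hypothetical helper names, not part of the model's API.

```python
# Model name plus the version ID from the Version Details section.
MODEL_REF = (
    "chenxwh/ltx-video:"
    "69599cebad125acfd3d5c682c187702ea7a84d537603e46b468a44fe94c5fd13"
)

# Default values as shown in the input dump above.
DEFAULTS = {
    "width": 704,
    "height": 480,
    "frame_rate": 25,
    "num_frames": 121,
    "guidance_scale": 3,
    "negative_prompt": "worst quality, inconsistent motion, blurry, jittery, distorted",
    "num_inference_steps": 40,
}


def build_input(prompt, **overrides):
    """Merge a prompt and optional overrides into the page's default inputs."""
    unknown = set(overrides) - set(DEFAULTS) - {"seed", "image"}
    if unknown:
        raise ValueError(f"unknown input(s): {sorted(unknown)}")
    return {**DEFAULTS, "prompt": prompt, **overrides}


def run_ltx_video(prompt, **overrides):
    """Submit one prediction; needs network access and an API token."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(MODEL_REF, input=build_input(prompt, **overrides))
```

Calling `run_ltx_video("Waves crash against jagged rocks.", seed=27317)` would submit a job with the seed from the example logs. Depending on the client version, the return value may be a URL string or a file-like object wrapping that URL.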
Input Parameters
seed | Type: integer
Random seed. Leave blank to randomize the seed.
image | Type: string
Optional input image for image-to-video generation.
width | Type: integer | Default: 704 | Maximum: 1280
Width of the output video frames. Optional if an input image is provided.
height | Type: integer | Default: 480 | Maximum: 720
Height of the output video frames. Optional if an input image is provided.
prompt | Type: string | Default: The waves crash against the jagged rocks of the shoreline. The waves crash against the jagged rocks of the shoreline, sending spray high into the air.The rocks are a dark gray color, with sharp edges and deep crevices. The water is a clear blue-green, with white foam where the waves break against the rocks. The sky is a light gray, with a few white clouds dotting the horizon.
Input prompt.
frame_rate | Type: integer | Default: 25
Frame rate of the output video.
num_frames | Type: integer | Default: 121 | Maximum: 257
Number of frames to generate in the output video.
guidance_scale | Type: number | Default: 3 | Range: 1 - 20
Scale for classifier-free guidance.
negative_prompt | Type: string | Default: worst quality, inconsistent motion, blurry, jittery, distorted
Negative prompt describing undesired features.
num_inference_steps | Type: integer | Default: 40 | Minimum: 1
Number of denoising steps.
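The documented types and bounds can be checked client-side before submitting a job. The sketch below encodes only the limits listed above; `validate_inputs` and `clip_seconds` are hypothetical helper names, and the server may enforce additional constraints that this page does not show.

```python
# (minimum, maximum) as listed on this page; None means no bound is documented.
BOUNDS = {
    "width": (None, 1280),
    "height": (None, 720),
    "num_frames": (None, 257),
    "guidance_scale": (1, 20),
    "num_inference_steps": (1, None),
}
INTEGER_FIELDS = {
    "seed", "width", "height", "frame_rate", "num_frames", "num_inference_steps",
}
NUMBER_FIELDS = {"guidance_scale"}


def validate_inputs(inputs):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, value in inputs.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        is_number = is_int or isinstance(value, float)
        if name in INTEGER_FIELDS and not is_int:
            problems.append(f"{name} must be an integer, got {value!r}")
            continue
        if name in NUMBER_FIELDS and not is_number:
            problems.append(f"{name} must be a number, got {value!r}")
            continue
        low, high = BOUNDS.get(name, (None, None))
        if low is not None and value < low:
            problems.append(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}={value} is above the maximum {high}")
    return problems


def clip_seconds(num_frames, frame_rate):
    """Playback length of the clip: frames divided by frames per second."""
    return num_frames / frame_rate
```

With the defaults, `clip_seconds(121, 25)` works out to 4.84 seconds of video.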
Output Schema

Output

Type: string | Format: uri
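Because the output is a URI string, saving the video locally is a plain download. A minimal standard-library sketch follows; `save_output` is a hypothetical helper, and any file extension chosen for the destination (for example `.mp4`) is an assumption, since this page does not state the container format.

```python
import shutil
import urllib.request
from pathlib import Path


def save_output(uri, destination):
    """Stream the resource at `uri` to `destination` and return the path."""
    destination = Path(destination)
    # Copy in chunks so the whole video never has to sit in memory.
    with urllib.request.urlopen(uri) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```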

Example Execution Logs
Using seed: 27317
Setting `clean_caption=True` requires the Beautiful Soup library but it was not found in your environment. You can install it with pip:
`pip install beautifulsoup4`. Please note that you may need to restart your runtime after installation.
Setting `clean_caption` to False...
Setting `clean_caption=True` requires the Beautiful Soup library but it was not found in your environment. You can install it with pip:
`pip install beautifulsoup4`. Please note that you may need to restart your runtime after installation.
Setting `clean_caption` to False...
  0%|          | 0/40 [00:00<?, ?it/s]
/root/.pyenv/versions/3.11.10/lib/python3.11/site-packages/torch/functional.py:534: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3595.)
return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  2%|▎         | 1/40 [00:00<00:29,  1.32it/s]
  5%|▌         | 2/40 [00:01<00:20,  1.83it/s]
  8%|▊         | 3/40 [00:01<00:22,  1.63it/s]
 10%|█         | 4/40 [00:02<00:23,  1.55it/s]
 12%|█▎        | 5/40 [00:03<00:23,  1.51it/s]
 15%|█▌        | 6/40 [00:03<00:22,  1.49it/s]
 18%|█▊        | 7/40 [00:04<00:22,  1.48it/s]
 20%|██        | 8/40 [00:05<00:21,  1.47it/s]
 22%|██▎       | 9/40 [00:05<00:21,  1.46it/s]
 25%|██▌       | 10/40 [00:06<00:20,  1.46it/s]
 28%|██▊       | 11/40 [00:07<00:19,  1.46it/s]
 30%|███       | 12/40 [00:08<00:19,  1.46it/s]
 32%|███▎      | 13/40 [00:08<00:18,  1.45it/s]
 35%|███▌      | 14/40 [00:09<00:17,  1.45it/s]
 38%|███▊      | 15/40 [00:10<00:17,  1.45it/s]
 40%|████      | 16/40 [00:10<00:16,  1.45it/s]
 42%|████▎     | 17/40 [00:11<00:15,  1.45it/s]
 45%|████▌     | 18/40 [00:12<00:15,  1.45it/s]
 48%|████▊     | 19/40 [00:12<00:14,  1.45it/s]
 50%|█████     | 20/40 [00:13<00:13,  1.45it/s]
 52%|█████▎    | 21/40 [00:14<00:13,  1.45it/s]
 55%|█████▌    | 22/40 [00:14<00:12,  1.44it/s]
 57%|█████▊    | 23/40 [00:15<00:11,  1.44it/s]
 60%|██████    | 24/40 [00:16<00:11,  1.44it/s]
 62%|██████▎   | 25/40 [00:17<00:10,  1.44it/s]
 65%|██████▌   | 26/40 [00:17<00:09,  1.44it/s]
 68%|██████▊   | 27/40 [00:18<00:09,  1.44it/s]
 70%|███████   | 28/40 [00:19<00:08,  1.44it/s]
 72%|███████▎  | 29/40 [00:19<00:07,  1.44it/s]
 75%|███████▌  | 30/40 [00:20<00:06,  1.44it/s]
 78%|███████▊  | 31/40 [00:21<00:06,  1.44it/s]
 80%|████████  | 32/40 [00:21<00:05,  1.44it/s]
 82%|████████▎ | 33/40 [00:22<00:04,  1.44it/s]
 85%|████████▌ | 34/40 [00:23<00:04,  1.44it/s]
 88%|████████▊ | 35/40 [00:23<00:03,  1.44it/s]
 90%|█████████ | 36/40 [00:24<00:02,  1.44it/s]
 92%|█████████▎| 37/40 [00:25<00:02,  1.44it/s]
 95%|█████████▌| 38/40 [00:26<00:01,  1.44it/s]
 98%|█████████▊| 39/40 [00:26<00:00,  1.44it/s]
100%|██████████| 40/40 [00:27<00:00,  1.44it/s]
100%|██████████| 40/40 [00:27<00:00,  1.46it/s]
/src/ltx_video/pipelines/pipeline_ltx_video.py:1073: FutureWarning: Accessing config attribute `in_channels` directly via 'Transformer3DModel' object attribute is deprecated. Please access 'in_channels' over 'Transformer3DModel's config object instead, e.g. 'unet.config.in_channels'.
out_channels=self.transformer.in_channels
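The progress bar gives a rough breakdown of the reported prediction time. The arithmetic below uses only figures from this page (40 steps, the final 1.46 it/s average, and the 31.15 s prediction time). Attributing the remainder to work outside the denoising loop, such as text encoding, decoding, and writing the video file, is an assumption; the logs do not itemize it.

```python
steps = 40               # num_inference_steps for this run
avg_rate = 1.46          # it/s, from the final progress-bar line
prediction_time = 31.15  # seconds, from Performance Metrics

# Time spent in the denoising loop, and whatever is left over.
denoise_seconds = steps / avg_rate
other_seconds = prediction_time - denoise_seconds

print(f"denoising: {denoise_seconds:.1f} s, other: {other_seconds:.1f} s")
```

On these numbers the denoising loop accounts for roughly 27 of the 31 seconds, so reducing `num_inference_steps` would be the most direct way to shorten a run.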
Version Details
Version ID
69599cebad125acfd3d5c682c187702ea7a84d537603e46b468a44fe94c5fd13
Version Created
November 24, 2024