thudm/cogvideox-t2v

256 runs, Sep 2024, Cog 0.9.23
text-to-video

About

Text-to-Video Diffusion Models with An Expert Transformer

Example Output

Prompt:

"A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."

Output

Performance Metrics

414.89s Prediction Time
496.99s Total Time
All Input Parameters
{
  "prompt": "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance.",
  "num_frames": 49,
  "guidance_scale": 6,
  "num_inference_steps": 50
}
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed
prompt Type: string, Default: A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance.
Input prompt
num_frames Type: integer, Default: 49
Number of frames for the output video
guidance_scale Type: number, Default: 6, Range: 1 - 20
Scale for classifier-free guidance
num_inference_steps Type: integer, Default: 50, Range: 1 - 500
Number of denoising steps
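
The defaults and ranges listed above can be checked client-side before a prediction is submitted. The following is a minimal sketch, not the model's own validation logic. `build_input` and `run_prediction` are hypothetical helper names, and the submission step assumes the third-party `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment.

```python
from typing import Any, Dict, Optional

# Model name plus the version ID listed under "Version Details" on this page.
MODEL_REF = (
    "thudm/cogvideox-t2v:"
    "e047b1d734c550671fb4de7f7df7f9341ed498b4aa7cd88b82533b60dfec33e3"
)


def build_input(
    prompt: str,
    num_frames: int = 49,
    guidance_scale: float = 6,
    num_inference_steps: int = 50,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the input payload (hypothetical helper), enforcing the listed ranges."""
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be in the range 1 - 20")
    if not 1 <= num_inference_steps <= 500:
        raise ValueError("num_inference_steps must be in the range 1 - 500")
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "num_frames": num_frames,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
    }
    # Omitting the seed lets the service randomize it.
    if seed is not None:
        payload["seed"] = seed
    return payload


def run_prediction(payload: Dict[str, Any]) -> Any:
    """Submit the payload (hypothetical helper; needs network access and an API token)."""
    import replicate  # assumption: third-party client, imported lazily

    return replicate.run(MODEL_REF, input=payload)
```

Calling `build_input("A panda playing guitar")` returns a payload with the same defaults as the example above; passing it to `run_prediction` starts a run that, judging by the metrics on this page, may take several minutes.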
Output Schema

Output

Type: string, Format: uri
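
Because the output is a URI string, the generated video has to be fetched in a separate step. The sketch below uses only the Python standard library. The helper names `output_filename` and `download_output` are hypothetical, and the `output.mp4` fallback name is an assumption, since this page does not state the container format.

```python
import shutil
import urllib.request
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(uri: str, fallback: str = "output.mp4") -> str:
    """Derive a local file name from the last path segment of the URI."""
    name = PurePosixPath(urlparse(uri).path).name
    return name or fallback


def download_output(uri: str, dest: str = "") -> str:
    """Stream the file behind the URI to disk and return the local path."""
    dest = dest or output_filename(uri)
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

Streaming with `shutil.copyfileobj` avoids holding the whole video in memory, which matters more as `num_frames` grows.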

Example Execution Logs
Using seed: 61175
  0%|          | 0/50 [00:00<?, ?it/s]
  2%|▏         | 1/50 [00:13<10:44, 13.16s/it]
  4%|▍         | 2/50 [00:20<07:52,  9.85s/it]
  6%|▌         | 3/50 [00:28<06:53,  8.79s/it]
  8%|▊         | 4/50 [00:35<06:25,  8.38s/it]
 10%|█         | 5/50 [00:44<06:25,  8.58s/it]
 12%|█▏        | 6/50 [00:52<06:02,  8.24s/it]
 14%|█▍        | 7/50 [01:00<05:45,  8.03s/it]
 16%|█▌        | 8/50 [01:07<05:31,  7.90s/it]
 18%|█▊        | 9/50 [01:15<05:19,  7.80s/it]
 20%|██        | 10/50 [01:22<05:09,  7.74s/it]
 22%|██▏       | 11/50 [01:30<05:00,  7.71s/it]
 24%|██▍       | 12/50 [01:38<04:53,  7.71s/it]
 26%|██▌       | 13/50 [01:45<04:44,  7.69s/it]
 28%|██▊       | 14/50 [01:53<04:35,  7.67s/it]
 30%|███       | 15/50 [02:01<04:27,  7.65s/it]
 32%|███▏      | 16/50 [02:08<04:20,  7.65s/it]
 34%|███▍      | 17/50 [02:16<04:12,  7.64s/it]
 36%|███▌      | 18/50 [02:24<04:04,  7.64s/it]
 38%|███▊      | 19/50 [02:31<03:56,  7.64s/it]
 40%|████      | 20/50 [02:39<03:49,  7.64s/it]
 42%|████▏     | 21/50 [02:46<03:41,  7.64s/it]
 44%|████▍     | 22/50 [02:54<03:34,  7.64s/it]
 46%|████▌     | 23/50 [03:02<03:26,  7.64s/it]
 48%|████▊     | 24/50 [03:09<03:18,  7.65s/it]
 50%|█████     | 25/50 [03:17<03:11,  7.66s/it]
 52%|█████▏    | 26/50 [03:25<03:03,  7.66s/it]
 54%|█████▍    | 27/50 [03:32<02:56,  7.66s/it]
 56%|█████▌    | 28/50 [03:40<02:48,  7.65s/it]
 58%|█████▊    | 29/50 [03:48<02:40,  7.65s/it]
 60%|██████    | 30/50 [03:55<02:33,  7.65s/it]
 62%|██████▏   | 31/50 [04:03<02:25,  7.65s/it]
 64%|██████▍   | 32/50 [04:11<02:17,  7.65s/it]
 66%|██████▌   | 33/50 [04:18<02:10,  7.65s/it]
 68%|██████▊   | 34/50 [04:26<02:02,  7.65s/it]
 70%|███████   | 35/50 [04:34<01:54,  7.65s/it]
 72%|███████▏  | 36/50 [04:41<01:47,  7.65s/it]
 74%|███████▍  | 37/50 [04:49<01:39,  7.66s/it]
 76%|███████▌  | 38/50 [04:57<01:32,  7.75s/it]
 78%|███████▊  | 39/50 [05:05<01:24,  7.72s/it]
 80%|████████  | 40/50 [05:12<01:16,  7.70s/it]
 82%|████████▏ | 41/50 [05:20<01:09,  7.68s/it]
 84%|████████▍ | 42/50 [05:27<01:01,  7.67s/it]
 86%|████████▌ | 43/50 [05:35<00:53,  7.67s/it]
 88%|████████▊ | 44/50 [05:43<00:45,  7.66s/it]
 90%|█████████ | 45/50 [05:50<00:38,  7.66s/it]
 92%|█████████▏| 46/50 [05:59<00:31,  7.79s/it]
 94%|█████████▍| 47/50 [06:07<00:23,  7.96s/it]
 96%|█████████▌| 48/50 [06:15<00:15,  7.86s/it]
 98%|█████████▊| 49/50 [06:22<00:07,  7.88s/it]
100%|██████████| 50/50 [06:30<00:00,  7.81s/it]
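
The timing in these logs can be related to the reported prediction time with a little arithmetic. The sketch below parses the final progress line with a regular expression written for this log format (the pattern and the `parse_progress` helper are assumptions, not part of any library) and compares the loop time with the 414.89 s prediction time listed under Performance Metrics.

```python
import re
from typing import Optional, Tuple

# Matches the "step/total [MM:SS<" portion of a progress line.
PROGRESS = re.compile(r"(\d+)/(\d+) \[(\d+):(\d+)<")


def parse_progress(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (step, total, elapsed_seconds), or None if the line does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    step, total, minutes, seconds = map(int, match.groups())
    return step, total, minutes * 60 + seconds


final_line = "100%|██████████| 50/50 [06:30<00:00,  7.81s/it]"
step, total, elapsed = parse_progress(final_line)

mean_step = elapsed / step           # average seconds per denoising step
outside_loop = 414.89 - elapsed      # prediction time not spent in the loop

print(f"{elapsed} s for {step} steps, {mean_step:.2f} s/step")
print(f"{outside_loop:.2f} s of prediction time outside the denoising loop")
```

The 50 denoising steps account for 390 s, about 7.8 s per step on average, so roughly 25 s of the prediction time falls outside the loop. The page does not say what that remaining time covers.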
Version Details
Version ID
e047b1d734c550671fb4de7f7df7f9341ed498b4aa7cd88b82533b60dfec33e3
Version Created
September 21, 2024