bytonylee/stable-cascade 🔢📝 → 🖼️

▶️ 83 runs 📅 Sep 2024 ⚙️ Cog 0.13.2

text-to-image

Performance

14.0s Typical run time

~140s Cold start (first call)

83 Total runs

About

Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models

Example Output

Prompt:

"A man with hoodie on, illustration"

Output

Performance Metrics

14.02s Prediction Time

139.58s Total Time
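
The gap between the two figures above can be worked out directly. This is a minimal sketch; attributing the difference to cold-start and queueing overhead is an assumption based on the "Cold start (first call)" figure on this page, not something the metrics state.

```python
# Figures copied from the Performance Metrics above.
prediction_time = 14.02   # seconds spent running the prediction
total_time = 139.58       # seconds from request to result

# Time spent outside the prediction itself (assumed: boot and queueing).
overhead = total_time - prediction_time
overhead_share = overhead / total_time

print(f"{overhead:.2f}s outside prediction ({overhead_share:.0%} of total)")
# prints: 125.56s outside prediction (90% of total)
```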

All Input Parameters

{
  "width": 1024,
  "height": 1024,
  "prompt": "A man with hoodie on, illustration",
  "num_images": 1,
  "steps_prior": 20,
  "steps_decoder": 10,
  "guidance_scale_prior": 4,
  "guidance_scale_decoder": 0
}
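
As a rough illustration, the example inputs above could be submitted with the Replicate Python client. This is a sketch under assumptions: the model reference is assembled from the owner/name string in the page header (copied as it appears, so it may need adjusting) and the Version ID listed on this page, and `run_example` is a hypothetical helper name. The call needs the `replicate` package and a `REPLICATE_API_TOKEN` environment variable.

```python
# Assumed model reference: "<owner>/<name>:<version>", built from this page.
MODEL_REF = (
    "bytonylee/stable-cascade:"
    "36c40b9cc271abb8bc4a0f8cbb59c68d0739ad076648533eac1ca7a56d268b0d"
)

# The "All Input Parameters" example, copied verbatim.
EXAMPLE_INPUT = {
    "width": 1024,
    "height": 1024,
    "prompt": "A man with hoodie on, illustration",
    "num_images": 1,
    "steps_prior": 20,
    "steps_decoder": 10,
    "guidance_scale_prior": 4,
    "guidance_scale_decoder": 0,
}


def run_example(model_ref=MODEL_REF, inputs=EXAMPLE_INPUT):
    """Submit the example inputs and return the model output (hypothetical helper)."""
    # Imported lazily so this module loads without the client installed.
    import replicate  # pip install replicate

    return replicate.run(model_ref, input=inputs)


if __name__ == "__main__":
    print(run_example())
```

Expect the first call to take far longer than later ones, in line with the cold-start figure reported on this page.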

Input Parameters

seed (integer): Random seed. Leave blank to randomize the seed.
width (integer, default: 1024, range: 1 - 2048): Width of the output image.
height (integer, default: 1024, range: 1 - 2048): Height of the output image.
prompt (string): Input prompt, text of what you want to generate.
num_images (integer, default: 1, range: 1 - 4): Number of output images.
steps_prior (integer, default: 20, range: 1 - 50): Number of denoising steps in prior.
steps_decoder (integer, default: 10, range: 1 - 50): Number of denoising steps in decoder.
negative_prompt (string): Input negative prompt, text of what you don't want to generate.
guidance_scale_prior (number, default: 4, range: 0 - 20): Scale for classifier-free guidance in prior.
guidance_scale_decoder (number, default: 0, range: 0 - 20): Scale for classifier-free guidance in decoder.
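
The defaults and ranges listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not the model's own validation: `SPEC`, `OPTIONAL`, and `build_input` are hypothetical names, and the values are transcribed from the parameter list.

```python
# name: (default, minimum, maximum), transcribed from the parameter list.
SPEC = {
    "width": (1024, 1, 2048),
    "height": (1024, 1, 2048),
    "num_images": (1, 1, 4),
    "steps_prior": (20, 1, 50),
    "steps_decoder": (10, 1, 50),
    "guidance_scale_prior": (4, 0, 20),
    "guidance_scale_decoder": (0, 0, 20),
}
# Parameters listed without a default or range; sent only when provided.
OPTIONAL = ("seed", "negative_prompt")


def build_input(prompt, **overrides):
    """Return an input payload, filling defaults and rejecting out-of-range values."""
    unknown = set(overrides) - set(SPEC) - set(OPTIONAL)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"prompt": prompt}
    for name, (default, lo, hi) in SPEC.items():
        value = overrides.get(name, default)
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside {lo}..{hi}")
        payload[name] = value

    for name in OPTIONAL:
        if overrides.get(name) is not None:
            payload[name] = overrides[name]
    return payload
```

For example, `build_input("A man with hoodie on, illustration", steps_prior=30)` keeps every other parameter at its listed default, while `width=4096` raises a `ValueError`.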

Output Schema

Output

Type: array • Items Type: string • Items Format: uri
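
Since the output is an array of URI strings, a caller typically has to download each item. A minimal sketch using only the standard library follows; `local_name` and `save_outputs` are hypothetical helpers, and the `.png` fallback extension is an assumption, since the schema does not state the image format.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(uri, index):
    """Derive a file name from the URI path, with an assumed fallback name."""
    name = os.path.basename(urlparse(uri).path)
    return name or f"output_{index}.png"  # ".png" is a guess, not from the schema


def save_outputs(uris, directory="."):
    """Download every URI in the output array and return the local paths."""
    paths = []
    for index, uri in enumerate(uris):
        path = os.path.join(directory, local_name(str(uri), index))
        urlretrieve(str(uri), path)  # performs the actual HTTP download
        paths.append(path)
    return paths
```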

Example Execution Logs

DEVICE: cuda
DTYPE: torch.float16
Using seed: 50236
Finish setup in 0.00011014938354492188 secs.
[Debug] Prompt: A man with hoodie on, illustration, best quality, high detail, sharp focus
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:01<00:29,  1.54s/it]
 10%|█         | 2/20 [00:01<00:12,  1.40it/s]
 15%|█▌        | 3/20 [00:01<00:07,  2.21it/s]
 20%|██        | 4/20 [00:01<00:05,  3.04it/s]
 25%|██▌       | 5/20 [00:02<00:03,  3.83it/s]
 30%|███       | 6/20 [00:02<00:03,  4.54it/s]
 35%|███▌      | 7/20 [00:02<00:02,  5.14it/s]
 40%|████      | 8/20 [00:02<00:02,  5.63it/s]
 45%|████▌     | 9/20 [00:02<00:01,  6.03it/s]
 50%|█████     | 10/20 [00:02<00:01,  6.33it/s]
 55%|█████▌    | 11/20 [00:02<00:01,  6.54it/s]
 60%|██████    | 12/20 [00:03<00:01,  6.70it/s]
 65%|██████▌   | 13/20 [00:03<00:01,  6.78it/s]
 70%|███████   | 14/20 [00:03<00:00,  6.89it/s]
 75%|███████▌  | 15/20 [00:03<00:00,  6.94it/s]
 80%|████████  | 16/20 [00:03<00:00,  6.97it/s]
 85%|████████▌ | 17/20 [00:03<00:00,  7.00it/s]
 90%|█████████ | 18/20 [00:03<00:00,  7.03it/s]
 95%|█████████▌| 19/20 [00:04<00:00,  7.06it/s]
100%|██████████| 20/20 [00:04<00:00,  7.07it/s]
100%|██████████| 20/20 [00:04<00:00,  4.75it/s]
  0%|          | 0/10 [00:00<?, ?it/s]
 10%|█         | 1/10 [00:01<00:17,  1.93s/it]
 20%|██        | 2/10 [00:02<00:07,  1.12it/s]
 30%|███       | 3/10 [00:02<00:03,  1.79it/s]
 40%|████      | 4/10 [00:02<00:02,  2.48it/s]
 50%|█████     | 5/10 [00:02<00:01,  3.17it/s]
 60%|██████    | 6/10 [00:02<00:01,  3.80it/s]
 70%|███████   | 7/10 [00:02<00:00,  4.34it/s]
 80%|████████  | 8/10 [00:03<00:00,  4.80it/s]
 90%|█████████ | 9/10 [00:03<00:00,  5.16it/s]
100%|██████████| 10/10 [00:03<00:00,  5.43it/s]
100%|██████████| 10/10 [00:03<00:00,  2.95it/s]
Finish generation in 12.288098096847534 secs.
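
The logs above carry two details worth extracting: the prompt the model actually used, which here is the submitted prompt plus an appended quality suffix, and the generation time. The sketch below pulls both out with regular expressions. It assumes the log line formats shown in this example stay the same, and `summarize_logs` is a hypothetical helper name.

```python
import re


def summarize_logs(log_text, submitted_prompt):
    """Extract the effective prompt, any appended suffix, and the generation time."""
    prompt_match = re.search(r"^\[Debug\] Prompt: (.*)$", log_text, re.MULTILINE)
    time_match = re.search(
        r"^Finish generation in ([0-9.]+) secs\.$", log_text, re.MULTILINE
    )

    effective = prompt_match.group(1) if prompt_match else None
    suffix = None
    if effective is not None and effective.startswith(submitted_prompt):
        # Whatever follows the submitted prompt was added on the model side.
        suffix = effective[len(submitted_prompt):]

    return {
        "effective_prompt": effective,
        "appended_suffix": suffix,
        "generation_secs": float(time_match.group(1)) if time_match else None,
    }
```

Applied to the logs on this page with the prompt "A man with hoodie on, illustration", the appended suffix is ", best quality, high detail, sharp focus".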

Version Details

Version ID: 36c40b9cc271abb8bc4a0f8cbb59c68d0739ad076648533eac1ca7a56d268b0d
Version Created: November 22, 2024
