jyoung105/stable-cascade
About
Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Example Output
Prompt:
"A man with hoodie on, illustration"
Output
Performance Metrics
- Prediction Time: 14.02s
- Total Time: 139.58s
All Input Parameters
{
"width": 1024,
"height": 1024,
"prompt": "A man with hoodie on, illustration",
"num_images": 1,
"steps_prior": 20,
"steps_decoder": 10,
"guidance_scale_prior": 4,
"guidance_scale_decoder": 0
}
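The parameters above can be sent to the model programmatically. The following is a minimal sketch, assuming the official `replicate` Python client is installed (`pip install replicate`) and a `REPLICATE_API_TOKEN` environment variable is set. The names `MODEL_REF`, `EXAMPLE_INPUT`, and `run_example` are illustrative and not part of this page. The version hash is the one listed under Version Details.

```python
import os

# "owner/name:version" reference; the hash is copied from Version Details.
MODEL_REF = (
    "jyoung105/stable-cascade:"
    "36c40b9cc271abb8bc4a0f8cbb59c68d0739ad076648533eac1ca7a56d268b0d"
)

# The exact input shown under "All Input Parameters".
EXAMPLE_INPUT = {
    "width": 1024,
    "height": 1024,
    "prompt": "A man with hoodie on, illustration",
    "num_images": 1,
    "steps_prior": 20,
    "steps_decoder": 10,
    "guidance_scale_prior": 4,
    "guidance_scale_decoder": 0,
}


def run_example(model_ref=MODEL_REF, payload=None):
    """Run one prediction and return whatever the client returns.

    The import is done lazily so this file can be loaded even when the
    `replicate` package is not installed.
    """
    import replicate

    return replicate.run(model_ref, input=dict(payload or EXAMPLE_INPUT))


if __name__ == "__main__":
    if os.environ.get("REPLICATE_API_TOKEN"):
        # The shape of the result (URLs or file-like objects) depends on
        # the client version, so it is printed as-is here.
        print(run_example())
    else:
        print("Set REPLICATE_API_TOKEN to run the prediction.")
```

Because `seed` is omitted from the payload, the model picks one at random, so repeated runs will generally produce different images.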
Input Parameters
- seed
  - Random seed. Leave blank to use a random seed.
- width
  - Width of the output image, in pixels.
- height
  - Height of the output image, in pixels.
- prompt
  - Input prompt: text describing what you want to generate.
- num_images
  - Number of output images.
- steps_prior
  - Number of denoising steps in the prior stage.
- steps_decoder
  - Number of denoising steps in the decoder stage.
- negative_prompt
  - Negative prompt: text describing what you do not want in the output.
- guidance_scale_prior
  - Classifier-free guidance scale for the prior stage.
- guidance_scale_decoder
  - Classifier-free guidance scale for the decoder stage.
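The parameter list above can be mirrored in a small typed container, which makes it harder to misspell a field name. This is only a sketch: the class name `CascadeInput` and the method `to_payload` are hypothetical, the Python types are assumptions, and the default values are simply the ones used in the example run on this page, not the model's documented defaults.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CascadeInput:
    """Hypothetical container for the input parameters listed above."""

    prompt: str
    negative_prompt: Optional[str] = None  # omitted from the payload if unset
    seed: Optional[int] = None             # omitted so the seed is randomized
    width: int = 1024                      # values below come from the example run
    height: int = 1024
    num_images: int = 1
    steps_prior: int = 20
    steps_decoder: int = 10
    guidance_scale_prior: float = 4.0
    guidance_scale_decoder: float = 0.0

    def to_payload(self) -> dict:
        """Return a plain dict, dropping optional fields that were left unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}


if __name__ == "__main__":
    payload = CascadeInput(prompt="A man with hoodie on, illustration").to_payload()
    print(sorted(payload))
```

Dropping unset optional fields, rather than sending `null`, matches the instruction to leave `seed` blank when a random seed is wanted.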
Output Schema
Output
Example Execution Logs
DEVICE: cuda DTYPE: torch.float16
Using seed: 50236
Finish setup in 0.00011014938354492188 secs.
[Debug] Prompt: A man with hoodie on, illustration, best quality, high detail, sharp focus
100%|██████████| 20/20 [00:04<00:00, 4.75it/s]
100%|██████████| 10/10 [00:03<00:00, 2.95it/s]
Finish generation in 12.288098096847534 secs.
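The timings in these logs can be cross-checked with a little arithmetic: each progress bar ends with an average rate in iterations per second, so steps divided by rate approximates the time spent in that denoising loop. The sketch below assumes tqdm-style summary lines and the "Finish generation in ... secs." message seen above. The function names are illustrative, and the sample text replaces the bar glyphs with `#`.

```python
import re

# Summary lines taken from the execution logs above (bar glyphs replaced by '#').
LOG = """\
Finish setup in 0.00011014938354492188 secs.
100%|##########| 20/20 [00:04<00:00, 4.75it/s]
100%|##########| 10/10 [00:03<00:00, 2.95it/s]
Finish generation in 12.288098096847534 secs.
"""

# Matches a finished tqdm bar: "100%|...| done/total [elapsed<remaining, rate it/s]".
TQDM_DONE = re.compile(
    r"100%\|[^|]*\|\s*(\d+)/(\d+)\s*\[[^\],]*,\s*([\d.]+)it/s\]"
)


def stage_seconds(log):
    """Return [(total_steps, approx_seconds), ...] in order of appearance."""
    stages = []
    for done, total, rate in TQDM_DONE.findall(log):
        if done == total:
            stages.append((int(total), int(total) / float(rate)))
    return stages


def generation_seconds(log):
    """Return the reported generation time in seconds, or None if absent."""
    match = re.search(r"Finish generation in ([\d.]+) secs", log)
    return float(match.group(1)) if match else None


def unaccounted_seconds(log):
    """Generation time not covered by the two denoising loops."""
    return generation_seconds(log) - sum(sec for _, sec in stage_seconds(log))


if __name__ == "__main__":
    for steps, seconds in stage_seconds(LOG):
        print(f"{steps} steps: about {seconds:.2f} s")
    print(f"outside the loops: about {unaccounted_seconds(LOG):.2f} s")
```

For this run the prior loop (20 steps) works out to roughly 4.2 s and the decoder loop (10 steps) to roughly 3.4 s, which leaves about 4.7 s of the 12.29 s generation time outside the two loops. The logs do not say what that remainder consists of. Raw tqdm output may repeat the 100% line for a stage, in which case the last occurrence carries the average rate.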
Version Details
- Version ID
  - 36c40b9cc271abb8bc4a0f8cbb59c68d0739ad076648533eac1ca7a56d268b0d
- Version Created
  - November 22, 2024