afiaka87/glid-3-xl 🖼️🔢❓📝✓ → 🖼️

▶️ 8.0K runs 📅 Jul 2022 ⚙️ Cog 0.3.13 🔗 GitHub ⚖️ License

image-inpainting image-to-image text-to-image

Performance

10.7s Typical run time

8.0K Total runs

About

CompVis `latent-diffusion text2im` finetuned for inpainting.

Example Output

Prompt:

"pikachu rendered in pixar"

Output

https://replicate.delivery/mgxm/804c6fe5-d74a-4796-8835-f411b3776dae/current_0.png

Performance Metrics

10.70s Prediction Time

10.88s Total Time

All Input Parameters

{
  "seed": -1,
  "steps": 100,
  "width": 256,
  "height": 256,
  "prompt": "pikachu rendered in pixar",
  "batch_size": 1,
  "guidance_scale": 5,
  "aesthetic_rating": 9,
  "aesthetic_weight": 0.5
}
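
The example run above could be reproduced from Python roughly as follows. This is a minimal sketch, assuming the official `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set; `MODEL_REF`, `example_input`, and `run_example` are illustrative names, not part of the model. The version ID is the one listed under Version Details on this page.

```python
import os

# Model reference in "owner/name:version" form; the version ID is the one
# listed under "Version Details" on this page.
MODEL_REF = (
    "afiaka87/glid-3-xl:"
    "d74db2a276065cf0d42fe9e2917219112ddf8c698f5d9acbe1cc353b58097dab"
)


def example_input():
    """Return the input parameters of the example run, copied from this page."""
    return {
        "seed": -1,
        "steps": 100,
        "width": 256,
        "height": 256,
        "prompt": "pikachu rendered in pixar",
        "batch_size": 1,
        "guidance_scale": 5,
        "aesthetic_rating": 9,
        "aesthetic_weight": 0.5,
    }


def run_example():
    """Submit the example run; needs the `replicate` package and an API token."""
    import replicate  # imported lazily so the helpers above work without it

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN before calling run_example()")
    return replicate.run(MODEL_REF, input=example_input())
```

Because the seed is -1, a random seed is chosen on each run, so the output will generally differ from the example image.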

Input Parameters

mask (Type: string): A mask image for inpainting an init_image. White pixels = keep, black pixels = discard. Resized to width = image width/8, height = image height/8.
seed (Type: integer, Default: -1, Range: -1 to 4294967295): Seed for random number generator. If -1, a random seed will be chosen.
steps (Type: integer, Default: 50, Range: 15 to 250): Number of diffusion steps to run. Due to PLMS sampling, using more than 100 steps is unnecessary and may simply produce the exact same output.
width (Default: 256): Target width.
height (Default: 256): Target height.
prompt (Type: string, Default: empty): Your text prompt.
negative (Type: string, Default: empty): (optional) Negate the model's prediction for this text from the model's prediction for the target text.
batch_size (Type: integer, Default: 4, Range: 1 to 16): Batch size. (higher = slower)
init_image (Type: string): (optional) Initial image to use for the model's prediction. If provided alongside a mask, the image will be inpainted instead.
guidance_scale (Type: number, Default: 5, Range: -20 to 100): Classifier-free guidance scale. Higher values will result in more guidance toward caption, with diminishing returns. Try values between 1.0 and 40.0. In general, going above 5.0 will introduce some artifacting.
aesthetic_rating (Type: integer, Default: 9): Aesthetic rating (1-9) - embed to use.
aesthetic_weight (Type: number, Default: 0.5): Aesthetic weight (0-1). How much to guide towards the aesthetic embed vs the prompt embed.
init_skip_fraction (Type: number, Default: 0, Range: 0 to 1): Fraction of sampling steps to skip when using an init image. Defaults to 0.0 if init_image is not specified and 0.5 if init_image is specified.
intermediate_outputs (Type: boolean, Default: false): Whether to return intermediate outputs. Enable to visualize the diffusion process and/or debug the model. May slow down inference.
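
The ranges in the list above can be checked on the client side before a request is submitted. The following is a minimal sketch; `RANGES`, `validate_inputs`, and `mask_size` are hypothetical helpers, not part of the model, and the use of integer division for the mask size is an assumption (the page only says "width/8" and "height/8").

```python
# (min, max) bounds copied from the parameter list above.
RANGES = {
    "seed": (-1, 4294967295),
    "steps": (15, 250),
    "batch_size": (1, 16),
    "guidance_scale": (-20, 100),
    "aesthetic_rating": (1, 9),
    "aesthetic_weight": (0, 1),
    "init_skip_fraction": (0, 1),
}


def validate_inputs(params):
    """Raise ValueError if a parameter falls outside its documented range."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return params


def mask_size(width, height):
    """Size the mask is resized to: one eighth of the image in each dimension."""
    # Assumption: integer division; the page does not say how remainders are handled.
    return width // 8, height // 8
```

For the default 256 x 256 image, `mask_size(256, 256)` gives a 32 x 32 mask. Parameters without a documented range, such as `prompt`, pass through unchecked.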

Output Schema

Output

Type: array • Items Type: array
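
Since the output is an array whose items are themselves arrays, a caller may want a single flat list. A minimal sketch, assuming each inner item is an image URL string (the schema does not state the inner item type); `flatten_output` is a hypothetical helper:

```python
def flatten_output(output):
    """Flatten a list of lists into a single list, preserving order."""
    return [item for group in output for item in group]
```

With placeholder values, `flatten_output([["a.png", "b.png"], ["c.png"])` returns the three entries in order.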

Example Execution Logs

Using seed 2882092835
Running simulation for pikachu rendered in pixar
Encoding text embeddings with pikachu rendered in pixar dimensions
Using aesthetic embedding 9 with weight 0.5
Running diffusion...

  0%|          | 0/100 [00:00<?, ?it/s]
  1%|          | 1/100 [00:00<00:41,  2.37it/s]
  2%|▏         | 2/100 [00:00<00:37,  2.64it/s]
  3%|▎         | 3/100 [00:01<00:35,  2.74it/s]
  5%|▌         | 5/100 [00:01<00:19,  4.79it/s]
  7%|▋         | 7/100 [00:01<00:14,  6.47it/s]
  9%|▉         | 9/100 [00:01<00:11,  7.79it/s]
 11%|█         | 11/100 [00:01<00:10,  8.80it/s]
 13%|█▎        | 13/100 [00:01<00:09,  9.56it/s]
 15%|█▌        | 15/100 [00:02<00:08, 10.08it/s]
 17%|█▋        | 17/100 [00:02<00:07, 10.42it/s]
 19%|█▉        | 19/100 [00:02<00:07, 10.70it/s]
 21%|██        | 21/100 [00:02<00:07, 10.84it/s]
 23%|██▎       | 23/100 [00:02<00:06, 11.04it/s]
 25%|██▌       | 25/100 [00:03<00:06, 11.11it/s]
 27%|██▋       | 27/100 [00:03<00:06, 11.19it/s]
 29%|██▉       | 29/100 [00:03<00:06, 11.25it/s]
 31%|███       | 31/100 [00:03<00:06, 11.29it/s]
 33%|███▎      | 33/100 [00:03<00:05, 11.31it/s]
 35%|███▌      | 35/100 [00:03<00:05, 11.36it/s]
 37%|███▋      | 37/100 [00:04<00:05, 11.34it/s]
 39%|███▉      | 39/100 [00:04<00:05, 11.33it/s]
 41%|████      | 41/100 [00:04<00:05, 11.32it/s]
 43%|████▎     | 43/100 [00:04<00:05, 11.34it/s]
 45%|████▌     | 45/100 [00:04<00:04, 11.32it/s]
 47%|████▋     | 47/100 [00:04<00:04, 11.38it/s]
 49%|████▉     | 49/100 [00:05<00:04, 11.40it/s]
 51%|█████     | 51/100 [00:05<00:04, 11.36it/s]
 53%|█████▎    | 53/100 [00:05<00:04, 11.37it/s]
 55%|█████▌    | 55/100 [00:05<00:03, 11.39it/s]
 57%|█████▋    | 57/100 [00:05<00:03, 11.37it/s]
 59%|█████▉    | 59/100 [00:06<00:03, 11.36it/s]
 61%|██████    | 61/100 [00:06<00:03, 11.39it/s]
 63%|██████▎   | 63/100 [00:06<00:03, 11.36it/s]
 65%|██████▌   | 65/100 [00:06<00:03, 11.38it/s]
 67%|██████▋   | 67/100 [00:06<00:02, 11.36it/s]
 69%|██████▉   | 69/100 [00:06<00:02, 11.35it/s]
 71%|███████   | 71/100 [00:07<00:02, 11.33it/s]
 73%|███████▎  | 73/100 [00:07<00:02, 11.32it/s]
 75%|███████▌  | 75/100 [00:07<00:02, 11.35it/s]
 77%|███████▋  | 77/100 [00:07<00:02, 11.31it/s]
 79%|███████▉  | 79/100 [00:07<00:01, 11.30it/s]
 81%|████████  | 81/100 [00:07<00:01, 11.30it/s]
 83%|████████▎ | 83/100 [00:08<00:01, 11.30it/s]
 85%|████████▌ | 85/100 [00:08<00:01, 11.28it/s]
 87%|████████▋ | 87/100 [00:08<00:01, 11.31it/s]
 89%|████████▉ | 89/100 [00:08<00:00, 11.28it/s]
 91%|█████████ | 91/100 [00:08<00:00, 11.30it/s]
 93%|█████████▎| 93/100 [00:09<00:00, 11.31it/s]
 95%|█████████▌| 95/100 [00:09<00:00, 11.32it/s]
 97%|█████████▋| 97/100 [00:09<00:00, 11.33it/s]
 99%|█████████▉| 99/100 [00:09<00:00, 11.30it/s]
100%|██████████| 100/100 [00:09<00:00, 10.34it/s]
Saving final sample/s

Version Details

Version ID: d74db2a276065cf0d42fe9e2917219112ddf8c698f5d9acbe1cc353b58097dab
Version Created: August 5, 2022
