philz1337x/multidiffusion-upscaler 🔢🖼️📝❓✓ → 🖼️

▶️ 23.7K runs 📅 Mar 2024 ⚙️ Cog 0.8.0-beta11 📄 Paper
image-restoration image-upscaling

About

High resolution image Upscaler and Enhancer. Twitter/X: @philz1337x

Example Output

Prompt:

"masterpiece, best quality, highres, <lora:more_details:0.5> <lora:SDXLrender_v2.0:1>"

Output

[Image: example output]

Performance Metrics

108.38s Prediction Time
554.96s Total Time
All Input Parameters
{
  "seed": 1337,
  "image": "https://replicate.delivery/pbxt/KZWaur3VOX61FKuhgCYsMUA7oJDI1tCVyGlTWEV3BIqLZOTe/2024-01-02%2009.18.49.jpg",
  "width": 512,
  "height": 512,
  "prompt": "masterpiece, best quality, highres, <lora:more_details:0.5> <lora:SDXLrender_v2.0:1>",
  "sd_vae": "vae-ft-mse-840000-ema-pruned.safetensors",
  "cn_model": "control_v11f1e_sd15_tile",
  "sd_model": "juggernaut_reborn.safetensors [338b85bc4f]",
  "cn_module": "tile_resample",
  "cn_weight": 0.6,
  "scheduler": "DPM++ 3M SDE Karras",
  "td_method": "MultiDiffusion",
  "cn_lowvram": false,
  "td_overlap": 4,
  "num_outputs": 1,
  "cn_downsample": 1,
  "td_tile_width": 112,
  "cn_resize_mode": 1,
  "cn_threshold_a": 1,
  "cn_threshold_b": 1,
  "guidance_scale": 6,
  "td_image_width": 1,
  "td_tile_height": 144,
  "cn_control_mode": 1,
  "cn_guidance_end": 1,
  "negative_prompt": "(worst quality, low quality, normal quality:2) JuggernautNegative-neg",
  "td_image_height": 1,
  "td_scale_factor": 2,
  "tv_fast_decoder": true,
  "tv_fast_encoder": true,
  "cn_pixel_perfect": true,
  "enable_tiled_vae": true,
  "td_noise_inverse": false,
  "td_upscaler_name": "4x-UltraSharp",
  "cn_guidance_start": 0,
  "enable_controlnet": true,
  "td_overwrite_size": true,
  "denoising_strength": 0.35,
  "td_keep_input_size": true,
  "td_tile_batch_size": 8,
  "tv_move_vae_to_gpu": true,
  "cn_preprocessor_res": 512,
  "num_inference_steps": 18,
  "tv_decoder_tile_size": 192,
  "tv_encoder_tile_size": 3072,
  "enable_tiled_diffusion": true,
  "td_noise_inverse_steps": 0,
  "clip_stop_at_last_layers": 1,
  "tv_fast_encoder_color_fix": true,
  "td_noise_inverse_renoise_kernel": 3,
  "td_noise_inverse_renoise_strength": 0
}
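The parameters above can also be supplied programmatically. The following is a minimal sketch, assuming the official `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set. `build_input` and `run_upscaler` are hypothetical helper names, not part of the model. Only a handful of fields are set explicitly; anything omitted is assumed to fall back to the defaults listed on this page.

```python
# Model reference built from the owner/name and the Version ID shown on this page.
MODEL_REF = (
    "philz1337x/multidiffusion-upscaler:"
    "88f19697c15c8befccd557649fe6b01fcfa55ade961c2d1c3c23d9c986fdaff7"
)


def build_input(image_url, scale_factor=2, denoising_strength=0.35, seed=1337):
    """Hypothetical helper: build an input dict using parameter names from this page."""
    if not 0 <= denoising_strength <= 1:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        "image": image_url,
        "seed": seed,
        "td_scale_factor": scale_factor,
        "denoising_strength": denoising_strength,
        "prompt": (
            "masterpiece, best quality, highres, "
            "<lora:more_details:0.5> <lora:SDXLrender_v2.0:1>"
        ),
        "negative_prompt": (
            "(worst quality, low quality, normal quality:2) JuggernautNegative-neg"
        ),
    }


def run_upscaler(image_url, **overrides):
    """Hypothetical helper: run the model and return its output (a list of image URIs)."""
    import replicate  # imported lazily so build_input works without the client installed

    return replicate.run(MODEL_REF, input=build_input(image_url, **overrides))
```

Calling `run_upscaler("https://example.com/photo.jpg", scale_factor=2)` would start a prediction; the example run on this page took 108.38s of prediction time, so expect the call to block for a while.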
Input Parameters
seed Type: integer; Default: 1337
Random seed. Leave blank to randomize the seed
image (required) Type: string
Input image
width Type: integer; Default: 512
Width of output image
height Type: integer; Default: 512
Height of output image
prompt Type: string; Default: masterpiece, best quality, highres, <lora:more_details:0.5> <lora:SDXLrender_v2.0:1>
Prompt
sd_vae Default: vae-ft-mse-840000-ema-pruned.safetensors
Stable Diffusion VAE checkpoint
cn_model Default: control_v11f1e_sd15_tile
ControlNet model
sd_model Default: juggernaut_reborn.safetensors [338b85bc4f]
Stable Diffusion model checkpoint
cn_module Default: tile_resample
ControlNet module
cn_weight Type: number; Default: 0.6
ControlNet weight
scheduler Default: DPM++ 3M SDE Karras
Scheduler
td_method Default: MultiDiffusion
Tiled diffusion method
cn_lowvram Type: boolean; Default: false
ControlNet lowvram
td_overlap Type: integer; Default: 4
Overlap
num_outputs Type: integer; Default: 1; Range: 1 - 4
Number of images to output
cn_downsample Type: number; Default: 1
ControlNet downsample
td_tile_width Type: integer; Default: 112
Tile width
cn_resize_mode Type: integer; Default: 1
ControlNet resize mode
cn_threshold_a Type: integer; Default: 1
ControlNet threshold a
cn_threshold_b Type: integer; Default: 1
ControlNet threshold b
guidance_scale Type: number; Default: 6; Range: 1 - 50
Scale for classifier-free guidance
td_image_width Type: integer; Default: 1
Image width
td_tile_height Type: integer; Default: 144
Tile height
cn_control_mode Type: integer; Default: 1
ControlNet control mode. 0 = Balanced, 1 = My prompt is more important, 2 = ControlNet is more important
cn_guidance_end Type: number; Default: 1
ControlNet guidance end
negative_prompt Type: string; Default: (worst quality, low quality, normal quality:2) JuggernautNegative-neg
Negative prompt
td_image_height Type: integer; Default: 1
Image height
td_scale_factor Type: number; Default: 2
Scale factor
tv_fast_decoder Type: boolean; Default: true
Fast decoder
tv_fast_encoder Type: boolean; Default: true
Fast encoder
cn_pixel_perfect Type: boolean; Default: true
ControlNet pixel perfect
enable_tiled_vae Type: boolean; Default: true
Enable tiled VAE
td_noise_inverse Type: boolean; Default: false
Noise inverse
td_upscaler_name Default: 4x-UltraSharp
Upscaler name
cn_guidance_start Type: number; Default: 0
ControlNet guidance start
enable_controlnet Type: boolean; Default: true
Enable ControlNet
td_overwrite_size Type: boolean; Default: true
Overwrite size
denoising_strength Type: number; Default: 0.35; Range: 0 - 1
Denoising strength. 1.0 corresponds to full destruction of information in the init image
td_keep_input_size Type: boolean; Default: true
Keep input size
td_tile_batch_size Type: integer; Default: 8
Tile batch size
tv_move_vae_to_gpu Type: boolean; Default: true
Move VAE to GPU (if possible)
cn_preprocessor_res Type: integer; Default: 512
ControlNet preprocessor resolution
num_inference_steps Type: integer; Default: 18; Range: 1 - 100
Number of denoising steps
tv_decoder_tile_size Type: integer; Default: 192
Decoder tile size
tv_encoder_tile_size Type: integer; Default: 3072
Encoder tile size
enable_tiled_diffusion Type: boolean; Default: true
Enable tiled diffusion
td_noise_inverse_steps Type: integer; Default: 0
Noise inverse steps
clip_stop_at_last_layers Type: integer; Default: 1
CLIP stop at last layers
tv_fast_encoder_color_fix Type: boolean; Default: true
Encoder color fix
td_noise_inverse_renoise_kernel Type: integer; Default: 3
Noise inverse renoise kernel
td_noise_inverse_renoise_strength Type: number; Default: 0
Noise inverse renoise strength
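Four of the parameters above declare an explicit range. A minimal sketch of a client-side check against those ranges follows. `RANGES` and `out_of_range` are hypothetical names introduced for illustration; the bounds are copied from this page and are assumed to be inclusive.

```python
# Ranges as documented on this page (assumed inclusive on both ends).
RANGES = {
    "num_outputs": (1, 4),
    "guidance_scale": (1, 50),
    "denoising_strength": (0, 1),
    "num_inference_steps": (1, 100),
}


def out_of_range(params):
    """Return (name, value) pairs from params that fall outside a documented range."""
    bad = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            bad.append((name, params[name]))
    return bad


# The example run's values all pass the check.
example = {
    "num_outputs": 1,
    "guidance_scale": 6,
    "denoising_strength": 0.35,
    "num_inference_steps": 18,
}
print(out_of_range(example))  # prints []
```

Checking locally before submitting avoids spending a queued prediction on a request that would presumably be rejected by input validation.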
Output Schema

Output

Type: array; Items Type: string; Items Format: uri
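Since the output is an array of URI strings, a client typically needs to download each item. The sketch below uses only the Python standard library. `local_name` and `save_outputs` are hypothetical helpers, and the `.png` fallback extension is an assumption, since this page does not state the output image format.

```python
import os
import posixpath
from urllib.parse import unquote, urlparse
from urllib.request import urlretrieve


def local_name(uri, index):
    """Derive a filename from an output URI, or fall back to an indexed name."""
    name = posixpath.basename(unquote(urlparse(str(uri)).path))
    # Assumption: ".png" is only a placeholder extension for URIs without a filename.
    return name or f"output_{index}.png"


def save_outputs(uris, directory="."):
    """Download every URI in the output array and return the local paths."""
    paths = []
    for index, uri in enumerate(uris):
        path = os.path.join(directory, local_name(uri, index))
        urlretrieve(str(uri), path)
        paths.append(path)
    return paths
```

With `num_outputs` at its default of 1, the array holds a single URI, so `save_outputs(output)[0]` would be the path of the upscaled image.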

Example Execution Logs
Creating model from config: /src/configs/v1-inference.yaml
2024-03-15 04:21:06,285 - ControlNet - INFO - ControlNet UI callback registered.
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
Couldn't find VAE named None; using None instead
Applying attention optimization: Doggettx... done.
Model loaded in 3.8s (load weights from disk: 0.1s, create model: 3.0s, apply weights to model: 0.4s).
Loading VAE weights specified in settings: /src/models/VAE/vae-ft-mse-840000-ema-pruned.safetensors
Applying attention optimization: Doggettx... done.
VAE weights loaded.
[Tiled Diffusion] upscaling image with 4x-UltraSharp...
[Tiled Diffusion] ControlNet found, support is enabled.
2024-03-15 04:21:18,999 - ControlNet - INFO - unit_separate = False, style_align = False
2024-03-15 04:21:19,257 - ControlNet - INFO - Loading model: control_v11f1e_sd15_tile [a371b31b]
2024-03-15 04:21:19,578 - ControlNet - INFO - Loaded state_dict from [/src/extensions/sd-webui-controlnet/models/control_v11f1e_sd15_tile.pth]
2024-03-15 04:21:19,578 - ControlNet - INFO - controlnet_default_config
2024-03-15 04:21:21,770 - ControlNet - INFO - ControlNet model control_v11f1e_sd15_tile [a371b31b](ControlModelType.ControlNet) loaded.
2024-03-15 04:21:21,815 - ControlNet - INFO - Using preprocessor: tile_resample
2024-03-15 04:21:21,815 - ControlNet - INFO - preprocessor resolution = 2288
2024-03-15 04:21:22,036 - ControlNet - INFO - ControlNet Hooked - Time = 3.041232109069824
MultiDiffusion hooked into 'DPM++ 3M SDE Karras' sampler, Tile size: 144x112, Tile count: 15, Batch size: 8, Tile batches: 2 (ext: ContrlNet)
[Tiled VAE]: input_size: torch.Size([1, 3, 2288, 4096]), tile_size: 3072, padding: 32
[Tiled VAE]: split to 1x2 = 2 tiles. Optimal tile size 2016x2240, original tile size 3072x3072
[Tiled VAE]: Fast mode enabled, estimating group norm parameters on 3072 x 1716 image
MultiDiffusion Sampling: : 0it [00:00, ?it/s]
[Tiled VAE]: Executing Encoder Task Queue:   0%|          | 0/182 [00:00<?, ?it/s]
[Tiled VAE]: Executing Encoder Task Queue:  10%|▉         | 18/182 [00:00<00:02, 77.37it/s]
[Tiled VAE]: Executing Encoder Task Queue:  14%|█▍        | 26/182 [00:00<00:03, 40.82it/s]
[Tiled VAE]: Executing Encoder Task Queue:  21%|██        | 38/182 [00:01<00:04, 33.10it/s]
[Tiled VAE]: Executing Encoder Task Queue:  23%|██▎       | 42/182 [00:01<00:05, 26.67it/s]
[Tiled VAE]: Executing Encoder Task Queue:  26%|██▋       | 48/182 [00:01<00:04, 29.20it/s]
[Tiled VAE]: Executing Encoder Task Queue:  29%|██▊       | 52/182 [00:01<00:06, 18.73it/s]
[Tiled VAE]: Executing Encoder Task Queue:  30%|███       | 55/182 [00:02<00:08, 15.39it/s]
[Tiled VAE]: Executing Encoder Task Queue:  32%|███▏      | 58/182 [00:02<00:09, 13.26it/s]
[Tiled VAE]: Executing Encoder Task Queue:  35%|███▌      | 64/182 [00:02<00:06, 17.21it/s]
[Tiled VAE]: Executing Encoder Task Queue:  37%|███▋      | 67/182 [00:03<00:07, 16.21it/s]
[Tiled VAE]: Executing Encoder Task Queue:  38%|███▊      | 70/182 [00:03<00:06, 17.51it/s]
[Tiled VAE]: Executing Encoder Task Queue:  40%|████      | 73/182 [00:03<00:05, 18.54it/s]
[Tiled VAE]: Executing Encoder Task Queue:  42%|████▏     | 76/182 [00:03<00:05, 18.91it/s]
[Tiled VAE]: Executing Encoder Task Queue:  46%|████▌     | 83/182 [00:03<00:04, 24.54it/s]
[Tiled VAE]: Executing Encoder Task Queue:  47%|████▋     | 86/182 [00:03<00:04, 23.38it/s]
[Tiled VAE]: Executing Encoder Task Queue:  49%|████▉     | 89/182 [00:03<00:04, 21.17it/s]
[Tiled VAE]: Executing Encoder Task Queue:  51%|█████     | 92/182 [00:04<00:04, 20.03it/s]
[Tiled VAE]: Executing Encoder Task Queue:  54%|█████▍    | 99/182 [00:04<00:02, 29.67it/s]
[Tiled VAE]: Executing Encoder Task Queue:  57%|█████▋    | 104/182 [00:04<00:02, 30.93it/s]
[Tiled VAE]: Executing Encoder Task Queue:  65%|██████▍   | 118/182 [00:04<00:01, 53.13it/s]
[Tiled VAE]: Executing Encoder Task Queue:  74%|███████▎  | 134/182 [00:04<00:00, 76.76it/s]
[Tiled VAE]: Executing Encoder Task Queue:  82%|████████▏ | 150/182 [00:04<00:00, 95.56it/s]
[Tiled VAE]: Executing Encoder Task Queue:  88%|████████▊ | 161/182 [00:05<00:00, 41.01it/s]
[Tiled VAE]: Executing Encoder Task Queue:  97%|█████████▋| 176/182 [00:05<00:00, 54.86it/s]
[Tiled VAE]: Executing Encoder Task Queue: 100%|██████████| 182/182 [00:05<00:00, 32.97it/s]
[Tiled VAE]: Done in 6.285s, max VRAM alloc 23998.532 MB
  0%|          | 0/7 [00:00<?, ?it/s]
MultiDiffusion Sampling: : 0it [00:13, ?it/s]
Total progress:   0%|          | 0/7 [00:00<?, ?it/s]
 14%|█▍        | 1/7 [00:11<01:11, 11.99s/it]
Total progress:  29%|██▊       | 2/7 [00:10<00:26,  5.25s/it]
 29%|██▊       | 2/7 [00:22<00:55, 11.12s/it]
Total progress:  43%|████▎     | 3/7 [00:21<00:29,  7.44s/it]
 43%|████▎     | 3/7 [00:33<00:43, 10.84s/it]
Total progress:  57%|█████▋    | 4/7 [00:31<00:25,  8.59s/it]
 57%|█████▋    | 4/7 [00:43<00:32, 10.71s/it]
Total progress:  71%|███████▏  | 5/7 [00:42<00:18,  9.26s/it]
 71%|███████▏  | 5/7 [00:54<00:21, 10.64s/it]
Total progress:  86%|████████▌ | 6/7 [00:52<00:09,  9.68s/it]
 86%|████████▌ | 6/7 [01:04<00:10, 10.60s/it]
100%|██████████| 7/7 [01:15<00:00, 10.57s/it]
100%|██████████| 7/7 [01:15<00:00, 10.73s/it]
Total progress: 100%|██████████| 7/7 [01:03<00:00,  9.95s/it]
[Tiled VAE]: input_size: torch.Size([1, 4, 286, 512]), tile_size: 192, padding: 11
[Tiled VAE]: split to 2x3 = 6 tiles. Optimal tile size 192x160, original tile size 192x192
[Tiled VAE]: Fast mode enabled, estimating group norm parameters on 192 x 107 image
[Tiled VAE]: Executing Decoder Task Queue:   0%|          | 0/738 [00:00<?, ?it/s]
[Tiled VAE]: Executing Decoder Task Queue:  17%|█▋        | 124/738 [00:00<00:02, 210.24it/s]
[Tiled VAE]: Executing Decoder Task Queue:  33%|███▎      | 247/738 [00:01<00:02, 211.90it/s]
[Tiled VAE]: Executing Decoder Task Queue:  50%|█████     | 370/738 [00:01<00:01, 263.89it/s]
[Tiled VAE]: Executing Decoder Task Queue:  67%|██████▋   | 493/738 [00:01<00:00, 282.74it/s]
[Tiled VAE]: Executing Decoder Task Queue:  83%|████████▎ | 616/738 [00:02<00:00, 294.08it/s]
[Tiled VAE]: Executing Decoder Task Queue: 100%|██████████| 738/738 [00:02<00:00, 295.40it/s]
[Tiled VAE]: Done in 3.381s, max VRAM alloc 9362.373 MB
Total progress: 100%|██████████| 7/7 [01:07<00:00,  9.95s/it]
Total progress: 100%|██████████| 7/7 [01:07<00:00,  9.57s/it]
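Several numbers in the logs above can be reproduced from the input parameters. The sketch below is an illustration rather than the extension's confirmed implementation: the grid formula, the VAE downscale factor of 8, and the step formula are assumptions chosen because they match the logged values (latent size 286 x 512, "Tile count: 15", "Tile batches: 2", and 7 sampling steps). All function names are hypothetical.

```python
import math


def latent_size(pixels, vae_factor=8):
    """Assumed: the VAE maps pixels to latents at 1/8 resolution (4096 -> 512, 2288 -> 286)."""
    return pixels // vae_factor


def tile_grid(latent_w, latent_h, tile_w, tile_h, overlap):
    """Assumed grid formula: tiles of the given size, each overlapping its neighbour by `overlap`."""
    cols = math.ceil((latent_w - overlap) / (tile_w - overlap))
    rows = math.ceil((latent_h - overlap) / (tile_h - overlap))
    return rows, cols


def img2img_steps(num_inference_steps, denoising_strength):
    """Assumed: img2img runs only the denoising_strength fraction of the steps, plus one."""
    return int(min(denoising_strength, 0.999) * num_inference_steps) + 1


# Values from the example run: a 4096 x 2288 upscaled image, td_tile_width=112,
# td_tile_height=144, td_overlap=4, td_tile_batch_size=8, 18 steps at strength 0.35.
lat_w, lat_h = latent_size(4096), latent_size(2288)
rows, cols = tile_grid(lat_w, lat_h, tile_w=112, tile_h=144, overlap=4)
tiles = rows * cols
batches = math.ceil(tiles / 8)
steps = img2img_steps(18, 0.35)

print(lat_w, lat_h, rows, cols, tiles, batches, steps)  # prints 512 286 3 5 15 2 7
```

Under these assumptions, larger tiles or a lower `td_scale_factor` reduce the tile count, and each sampling step processes every tile batch, which is consistent with the roughly 10 seconds per step seen in the log.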
Version Details
Version ID
88f19697c15c8befccd557649fe6b01fcfa55ade961c2d1c3c23d9c986fdaff7
Version Created
March 15, 2024