joetm/camerabooth-openpose-style 📝🔢❓🖼️✓ → 🖼️
About
Stable Diffusion 1.4 with T2I-Adapter (ControlNet-style conditioning), using the style and openpose adapters, followed by two upscaling passes with Real-ESRGAN.
Example Output
Output

Performance Metrics
- Prediction Time: 45.64s
- Total Time: 45.67s
All Input Parameters
{
"code": "",
"seed": 0,
"booth": "public",
"image": "https://replicate.delivery/pbxt/JdXiguR2wRjl03jOe4fLpfrEEewitrSFTStYxiBpM6Qa6Xqm/oliver-ragfelt-m79taQSsQIQ-unsplash.jpg",
"steps": 50,
"prompt": "",
"artwork": "https://replicate.delivery/pbxt/JdXihCVozhMdqBwECWdGRkbbgib7elGQnmJcheaupHfgKTIt/child-with-dove-1901.jpg",
"cond_tau": 1,
"clip_mode": "best",
"n_samples": 1,
"timestamp": 0,
"add_prompt": "",
"neg_prompt": "(((nsfw))), (((text))), (((words))), ((low quality)), worst quality, bad quality, ((bad art)), lowres, ((disfigured)), ((deformed)), ((mutilated)), glitch, ((distorted)), malformed, mutated, (((disfigured))), misaligned, poorly drawn, (((blurry))), ((blurred)), mutated, bad arms, ((extra limbs)), missing arms, missing legs, disconnected limbs, bad hands, ((poorly drawn hands)), ((poorly drawn face)), deformed hands, ((extra fingers)), ((extra legs)), fused fingers, (too many fingers), mutated hands, amputated limbs, no arms, extra arms, multiple arms, more than two legs, long neck, bad proportions, longbody, bad anatomy, missing fingers, extra digit, fewer digits, deformed eyes, poorly drawn eyes, cross-eye, ((cross-eyed)), bad skin, multiple, duplicated, by Bad Artist, monochrome, monotone, grayscale, b&w, sketches, speech bubble, signature, watermark, border, logo, ((morbid)), canvas frame, frame, 3d",
"face_enhance": true,
"style_weight": 1,
"use_upscaler": true,
"detect_prompt": true,
"guidance_scale": 8,
"style_cond_tau": 0.8,
"upscale_factor": 2,
"openpose_weight": 1
}
Input Parameters
- code: The keyphrase given to participants. Autofilled - leave blank.
- seed: Random seed (optional). Set to 0 to randomize the seed.
- booth: The booth version (public or private). Autofilled - leave blank.
- image (required): Input pose image.
- steps: Number of denoising steps. Default: 50.
- prompt: Input prompt (optional). This field is ignored if detect_prompt = True.
- artwork (required): Input artwork image.
- cond_tau: Conditioning tau. Default: 1.0.
- clip_mode: CLIP Interrogator prompt mode (best takes 10-20 seconds, fast takes 1-2 seconds). Default: best. This field is ignored if detect_prompt = False.
- n_samples: Number of images to generate. Default: 1.
- timestamp: Unix timestamp. Autofilled - leave blank.
- add_prompt: Additional prompt (optional), appended to the prompt.
- neg_prompt: Negative prompt (optional).
- face_enhance: Run GFPGAN face enhancement along with upscaling.
- style_weight: Weight of the style adapter. Default: 1.0.
- use_upscaler: Upscale the resulting image(s) by upscale_factor?
- detect_prompt: Use CLIP-Interrogator to detect prompt from the artwork. Default: True.
- guidance_scale: Scale for classifier-free guidance. Default: 7.5.
- style_cond_tau: Style conditioning tau. Default: 0.8.
- upscale_factor: Factor to scale image by.
- openpose_weight: Weight of the openpose adapter. Default: 1.0.
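The parameters above can be assembled into a request payload. The following is a minimal sketch, assuming the Replicate Python client (the `replicate` package, with `REPLICATE_API_TOKEN` set in the environment). The helper names `build_input`, `ignored_fields`, and `run_prediction` are hypothetical and not part of the model, and the `example.com` URLs are placeholders. Defaults are copied from the parameter list; the autofilled fields (code, booth, timestamp) are deliberately left out.

```python
from typing import Any, Dict, List

MODEL = "joetm/camerabooth-openpose-style"
VERSION = "1833bc8044109a1d7767f6bcffa05e55fbf35eccba8a2f1e9b1be16b9ca56801"

# Defaults as documented in the parameter list.
DEFAULTS: Dict[str, Any] = {
    "steps": 50,
    "cond_tau": 1.0,
    "clip_mode": "best",
    "n_samples": 1,
    "style_weight": 1.0,
    "detect_prompt": True,
    "guidance_scale": 7.5,
    "style_cond_tau": 0.8,
    "openpose_weight": 1.0,
}

# Parameters that may be set but have no documented default.
OPTIONAL_WITHOUT_DEFAULT = {
    "seed", "prompt", "add_prompt", "neg_prompt",
    "face_enhance", "use_upscaler", "upscale_factor",
}


def build_input(image: str, artwork: str, **overrides: Any) -> Dict[str, Any]:
    """Merge documented defaults with overrides; image and artwork are required."""
    if not image or not artwork:
        raise ValueError("image and artwork are required")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        # Also rejects the autofilled fields (code, booth, timestamp).
        raise ValueError(f"unknown or autofilled parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides, "image": image, "artwork": artwork}


def ignored_fields(payload: Dict[str, Any]) -> List[str]:
    """List fields that are set but ignored, per the detect_prompt rules."""
    ignored: List[str] = []
    if payload.get("detect_prompt", True):
        if payload.get("prompt"):
            ignored.append("prompt")  # prompt is detected from the artwork instead
    elif "clip_mode" in payload:
        ignored.append("clip_mode")  # CLIP Interrogator is not run
    return ignored


def run_prediction(payload: Dict[str, Any]) -> Any:
    """Submit the payload; needs network access and an API token."""
    import replicate  # third-party client, imported lazily

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)


if __name__ == "__main__":
    payload = build_input(
        "https://example.com/pose.jpg",     # placeholder pose image
        "https://example.com/artwork.jpg",  # placeholder artwork image
        guidance_scale=8,
        prompt="a portrait",
    )
    print(ignored_fields(payload))
```

Checking `ignored_fields` before submitting is a cheap way to notice that a manually written prompt will be discarded while detect_prompt is left at its default.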
Output Schema
Output
Example Execution Logs
Using negative prompt: (((nsfw))), (((text))), (((words))), ((low quality)), worst quality, bad quality, ((bad art)), lowres, ((disfigured)), ((deformed)), ((mutilated)), glitch, ((distorted)), malformed, mutated, (((disfigured))), misaligned, poorly drawn, (((blurry))), ((blurred)), mutated, bad arms, ((extra limbs)), missing arms, missing legs, disconnected limbs, bad hands, ((poorly drawn hands)), ((poorly drawn face)), deformed hands, ((extra fingers)), ((extra legs)), fused fingers, (too many fingers), mutated hands, amputated limbs, no arms, extra arms, multiple arms, more than two legs, long neck, bad proportions, longbody, bad anatomy, missing fingers, extra digit, fewer digits, deformed eyes, poorly drawn eyes, cross-eye, ((cross-eyed)), bad skin, multiple, duplicated, by Bad Artist, monochrome, monotone, grayscale, b&w, sketches, speech bubble, signature, watermark, border, logo, ((morbid)), canvas frame, frame, 3d
Detecting prompt with CI (ViT-L-14/openai)...
Global seed set to 12087
Loading caption model blip-large...
Loading CLIP model ViT-L-14/openai...
Loaded CLIP model and data in 3.96 seconds.
100%|██████████| 55/55 [00:00<00:00, 436.43it/s]
Flavor chain: 44%|████▍ | 14/32 [00:13<00:17, 1.01it/s]
100%|██████████| 55/55 [00:00<00:00, 434.66it/s]
100%|██████████| 6/6 [00:00<00:00, 317.31it/s]
100%|██████████| 50/50 [00:00<00:00, 452.42it/s]
Prompt: painting of a child holding a bird in a field with a ball, style of picasso, white dove, macaron, midday photograph, platypus, listing image, nagasaki, guggimon, by Christian Jane Fergusson, head in hands, bluejay, single color, holding a white flag, nouvelle vague
Enabling style adapter.
Enabling openpose adapter.
Data shape for DDIM sampling is (1, 4, 96, 64), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:11<00:00, 4.28it/s]
Generated outputs/image0-steps50-cfg8.0-face_enhance.png
Running upscaler with face enhancement
Image 0, upscaling 1/2, 2.0x
Image 0, upscaling 2/2, 4.0x
Written poseimg to outputs/openpose.png
Done. 2 image(s).
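The image sizes implied by these logs can be worked out from the line "Data shape for DDIM sampling is (1, 4, 96, 64)". The sketch below assumes the shape is ordered (batch, channels, height, width) and uses the 8x spatial downsampling of the Stable Diffusion v1 autoencoder. It also assumes that the 2.0x and 4.0x figures in the upscaling lines are cumulative factors relative to the base image, which the logs do not state explicitly. The function names are hypothetical.

```python
from typing import Tuple

VAE_DOWNSAMPLE = 8  # Stable Diffusion v1 latents are 1/8 of the pixel resolution


def base_pixel_size(latent_shape: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Return (height, width) in pixels for a (batch, channels, h, w) latent."""
    _, _, latent_h, latent_w = latent_shape
    return latent_h * VAE_DOWNSAMPLE, latent_w * VAE_DOWNSAMPLE


def upscaled_size(size: Tuple[int, int], factor: float) -> Tuple[int, int]:
    """Scale a (height, width) pair by a cumulative factor (assumption, see lead-in)."""
    height, width = size
    return int(height * factor), int(width * factor)


if __name__ == "__main__":
    base = base_pixel_size((1, 4, 96, 64))
    print("base:", base)                          # (768, 512)
    print("pass 1/2:", upscaled_size(base, 2.0))  # (1536, 1024)
    print("pass 2/2:", upscaled_size(base, 4.0))  # (3072, 2048)
```

Under these assumptions the diffusion step produces a 768x512 portrait image, and the two Real-ESRGAN passes bring it to 3072x2048.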
Version Details
- Version ID: 1833bc8044109a1d7767f6bcffa05e55fbf35eccba8a2f1e9b1be16b9ca56801
- Version Created: September 25, 2023