joetm/camerabooth-openpose-style 📝🔢❓🖼️✓ → 🖼️

▶️ 478 runs 📅 Sep 2023 ⚙️ Cog 0.8.6

image-style-transfer image-to-image openpose

Performance

45.6s Typical run time

478 Total runs

About

Stable Diffusion 1.4 + T2I-Adapter (ControlNet) with style and openpose adapters, followed by two upscaling passes with Real-ESRGAN

Example Output

Output

Performance Metrics

45.64s Prediction Time

45.67s Total Time

All Input Parameters

{
  "code": "",
  "seed": 0,
  "booth": "public",
  "image": "https://replicate.delivery/pbxt/JdXiguR2wRjl03jOe4fLpfrEEewitrSFTStYxiBpM6Qa6Xqm/oliver-ragfelt-m79taQSsQIQ-unsplash.jpg",
  "steps": 50,
  "prompt": "",
  "artwork": "https://replicate.delivery/pbxt/JdXihCVozhMdqBwECWdGRkbbgib7elGQnmJcheaupHfgKTIt/child-with-dove-1901.jpg",
  "cond_tau": 1,
  "clip_mode": "best",
  "n_samples": 1,
  "timestamp": 0,
  "add_prompt": "",
  "neg_prompt": "(((nsfw))), (((text))), (((words))), ((low quality)), worst quality, bad quality, ((bad art)), lowres, ((disfigured)), ((deformed)), ((mutilated)), glitch, ((distorted)), malformed, mutated, (((disfigured))), misaligned, poorly drawn, (((blurry))), ((blurred)), mutated, bad arms, ((extra limbs)), missing arms, missing legs, disconnected limbs, bad hands, ((poorly drawn hands)), ((poorly drawn face)), deformed hands, ((extra fingers)), ((extra legs)), fused fingers, (too many fingers), mutated hands, amputated limbs, no arms, extra arms, multiple arms, more than two legs, long neck, bad proportions, longbody, bad anatomy, missing fingers, extra digit, fewer digits, deformed eyes, poorly drawn eyes, cross-eye, ((cross-eyed)), bad skin, multiple, duplicated, by Bad Artist, monochrome, monotone, grayscale, b&w, sketches, speech bubble, signature, watermark, border, logo, ((morbid)), canvas frame, frame, 3d",
  "face_enhance": true,
  "style_weight": 1,
  "use_upscaler": true,
  "detect_prompt": true,
  "guidance_scale": 8,
  "style_cond_tau": 0.8,
  "upscale_factor": 2,
  "openpose_weight": 1
}
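The inputs above could be submitted from Python roughly as follows. This is a minimal sketch, assuming the official `replicate` client package is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are hypothetical; only the model name, version ID, and field names come from this page.

```python
# Model reference assembled from the model name and Version ID listed on this page.
MODEL_REF = (
    "joetm/camerabooth-openpose-style:"
    "1833bc8044109a1d7767f6bcffa05e55fbf35eccba8a2f1e9b1be16b9ca56801"
)


def build_input(image_url, artwork_url, **overrides):
    """Return an input dict with the two required fields plus optional overrides.

    Hypothetical helper: the values below mirror the example run on this page.
    """
    payload = {
        "image": image_url,      # required: input pose image
        "artwork": artwork_url,  # required: input artwork image
        "steps": 50,
        "guidance_scale": 8,
        "detect_prompt": True,
        "use_upscaler": True,
        "upscale_factor": 2,
    }
    payload.update(overrides)
    return payload


def run_prediction(payload):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(MODEL_REF, input=payload)


if __name__ == "__main__":
    example = build_input(
        "https://example.com/pose.jpg",     # placeholder URL, not from this page
        "https://example.com/artwork.jpg",  # placeholder URL, not from this page
        steps=30,
    )
    print(example)
```

Fields left out of the payload fall back to the model's schema defaults. The exact type returned by `replicate.run` depends on the client version, so treat the return value as an iterable of output references rather than relying on a specific class.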

Input Parameters

code (Type: string; Default: empty): The keyphrase given to participants. Autofilled - leave blank.
seed (Type: integer; Default: 0): Random seed (optional). Set to 0 to randomize the seed.
booth (Default: public): The booth version (public or private). Autofilled - leave blank.
image (required; Type: string): Input pose image.
steps (Type: integer; Default: 50; Range: 1 - 100): Number of denoising steps.
prompt (Type: string; Default: empty): Input prompt (optional). This field is ignored if detect_prompt = True.
artwork (required; Type: string): Input artwork image.
cond_tau (Type: number; Default: 1; Range: 0 - 1): Conditioning tau.
clip_mode (Default: best): CLIP Interrogator prompt mode (best takes 10-20 seconds, fast takes 1-2 seconds). This field is ignored if detect_prompt = False.
n_samples (Type: integer; Default: 1; Range: 1 - 4): Number of images to generate.
timestamp (Type: integer; Default: 0): Unix timestamp. Autofilled - leave blank.
add_prompt (Type: string; Default: empty): Additional prompt (optional), appended to the prompt.
neg_prompt (Type: string): Negative prompt (optional). Default: (((nsfw))), (((text))), (((words))), ((low quality)), worst quality, bad quality, ((bad art)), lowres, ((disfigured)), ((deformed)), ((mutilated)), glitch, ((distorted)), malformed, mutated, (((disfigured))), misaligned, poorly drawn, (((blurry))), ((blurred)), mutated, bad arms, ((extra limbs)), missing arms, missing legs, disconnected limbs, bad hands, ((poorly drawn hands)), ((poorly drawn face)), deformed hands, ((extra fingers)), ((extra legs)), fused fingers, (too many fingers), mutated hands, amputated limbs, no arms, extra arms, multiple arms, more than two legs, long neck, bad proportions, longbody, bad anatomy, missing fingers, extra digit, fewer digits, deformed eyes, poorly drawn eyes, cross-eye, ((cross-eyed)), bad skin, multiple, duplicated, by Bad Artist, monochrome, monotone, grayscale, b&w, sketches, speech bubble, signature, watermark, border, logo, ((morbid)), canvas frame, frame, 3d
face_enhance (Type: boolean; Default: true): Run GFPGAN face enhancement along with upscaling.
style_weight (Type: number; Default: 1; Range: 0 - 1): Weight of the style adapter.
use_upscaler (Type: boolean; Default: true): Upscale the resulting image(s) by upscale_factor.
detect_prompt (Type: boolean; Default: true): Use CLIP-Interrogator to detect prompt from the artwork.
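guidance_scale (Type: number; Default: 8; Range: 1 - 50): Scale for classifier-free guidance. The description text gives a default of 7.5, but the schema default (and the value used in the example run) is 8.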
style_cond_tau (Type: number; Default: 0.8; Range: 0 - 1): Style conditioning tau.
upscale_factor (Type: number; Default: 2; Range: 0 - 4): Factor to scale image by.
openpose_weight (Type: number; Default: 1; Range: 0 - 1): Weight of the openpose adapter.
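The ranges and required fields listed above can be checked on the client side before a request is sent. The sketch below is illustrative only: `validate_input` is a hypothetical helper, and the bounds are copied from the parameter list rather than read from the model's live schema.

```python
# (low, high) bounds copied from the Range entries in the parameter list.
RANGES = {
    "steps": (1, 100),
    "cond_tau": (0, 1),
    "n_samples": (1, 4),
    "style_weight": (0, 1),
    "guidance_scale": (1, 50),
    "style_cond_tau": (0, 1),
    "upscale_factor": (0, 4),
    "openpose_weight": (0, 1),
}

# The two fields marked "required" in the parameter list.
REQUIRED = ("image", "artwork")


def validate_input(payload):
    """Return a list of problems; an empty list means no problem was found."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED
        if not payload.get(name)
    ]
    for name, (low, high) in RANGES.items():
        # Only check fields the caller actually set; others use schema defaults.
        if name in payload and not (low <= payload[name] <= high):
            problems.append(f"{name}={payload[name]} outside range {low} - {high}")
    return problems


if __name__ == "__main__":
    print(validate_input({"image": "pose.jpg", "steps": 101}))
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem at once. The server performs its own validation, so this check is a convenience, not a replacement.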

Output Schema

Output

Type: array • Items Type: string • Items Format: uri
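Since the output is an array of URI strings, a caller will typically want to check that shape and derive local file names before downloading. The following is a rough sketch using only the standard library. `output_filenames` and the `output-N.png` fallback name are assumptions made for illustration, not behaviour documented on this page.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filenames(output):
    """Check that output is a list of URI strings and return one file name per item."""
    if not isinstance(output, list) or not all(isinstance(u, str) for u in output):
        raise TypeError("expected a list of URI strings")
    names = []
    for index, uri in enumerate(output):
        parsed = urlparse(uri)
        if not parsed.scheme:
            raise ValueError(f"item {index} is not a URI: {uri!r}")
        # Fall back to a numbered name when the URI path has no final component.
        name = PurePosixPath(parsed.path).name or f"output-{index}.png"
        names.append(name)
    return names


if __name__ == "__main__":
    # Placeholder URLs, not real outputs of this model.
    print(output_filenames(["https://example.com/a/out-0.png", "https://example.com/"]))
```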

Example Execution Logs

Using negative prompt: (((nsfw))), (((text))), (((words))), ((low quality)), worst quality, bad quality, ((bad art)), lowres, ((disfigured)), ((deformed)), ((mutilated)), glitch, ((distorted)), malformed, mutated, (((disfigured))), misaligned, poorly drawn, (((blurry))), ((blurred)), mutated, bad arms, ((extra limbs)), missing arms, missing legs, disconnected limbs, bad hands, ((poorly drawn hands)), ((poorly drawn face)), deformed hands, ((extra fingers)), ((extra legs)), fused fingers, (too many fingers), mutated hands, amputated limbs, no arms, extra arms, multiple arms, more than two legs, long neck, bad proportions, longbody, bad anatomy, missing fingers, extra digit, fewer digits, deformed eyes, poorly drawn eyes, cross-eye, ((cross-eyed)), bad skin, multiple, duplicated, by Bad Artist, monochrome, monotone, grayscale, b&w, sketches, speech bubble, signature, watermark, border, logo, ((morbid)), canvas frame, frame, 3d
Detecting prompt with CI (ViT-L-14/openai)...
Global seed set to 12087
Loading caption model blip-large...
Loading CLIP model ViT-L-14/openai...
Loaded CLIP model and data in 3.96 seconds.
  0%|          | 0/55 [00:00<?, ?it/s]
 80%|████████  | 44/55 [00:00<00:00, 431.30it/s]
100%|██████████| 55/55 [00:00<00:00, 436.43it/s]
Flavor chain:   0%|          | 0/32 [00:00<?, ?it/s]
Flavor chain:   3%|▎         | 1/32 [00:00<00:25,  1.23it/s]
Flavor chain:   6%|▋         | 2/32 [00:01<00:24,  1.21it/s]
Flavor chain:   9%|▉         | 3/32 [00:02<00:24,  1.19it/s]
Flavor chain:  12%|█▎        | 4/32 [00:03<00:23,  1.18it/s]
Flavor chain:  16%|█▌        | 5/32 [00:04<00:23,  1.17it/s]
Flavor chain:  19%|█▉        | 6/32 [00:05<00:22,  1.15it/s]
Flavor chain:  22%|██▏       | 7/32 [00:06<00:21,  1.14it/s]
Flavor chain:  25%|██▌       | 8/32 [00:06<00:21,  1.13it/s]
Flavor chain:  28%|██▊       | 9/32 [00:07<00:20,  1.11it/s]
Flavor chain:  31%|███▏      | 10/32 [00:08<00:20,  1.09it/s]
Flavor chain:  34%|███▍      | 11/32 [00:09<00:19,  1.07it/s]
Flavor chain:  38%|███▊      | 12/32 [00:10<00:18,  1.06it/s]
Flavor chain:  41%|████      | 13/32 [00:11<00:18,  1.04it/s]
Flavor chain:  44%|████▍     | 14/32 [00:12<00:17,  1.02it/s]
Flavor chain:  44%|████▍     | 14/32 [00:13<00:17,  1.01it/s]
  0%|          | 0/55 [00:00<?, ?it/s]
 78%|███████▊  | 43/55 [00:00<00:00, 429.39it/s]
100%|██████████| 55/55 [00:00<00:00, 434.66it/s]
  0%|          | 0/6 [00:00<?, ?it/s]
100%|██████████| 6/6 [00:00<00:00, 317.31it/s]
  0%|          | 0/50 [00:00<?, ?it/s]
 90%|█████████ | 45/50 [00:00<00:00, 447.75it/s]
100%|██████████| 50/50 [00:00<00:00, 452.42it/s]
Prompt: painting of a child holding a bird in a field with a ball, style of picasso, white dove, macaron, midday photograph, platypus, listing image, nagasaki, guggimon, by Christian Jane Fergusson, head in hands, bluejay, single color, holding a white flag, nouvelle vague
Enabling style adapter.
Enabling openpose adapter.
Data shape for DDIM sampling is (1, 4, 96, 64), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler:   0%|          | 0/50 [00:00<?, ?it/s]
DDIM Sampler:   2%|▏         | 1/50 [00:00<00:12,  4.04it/s]
DDIM Sampler:   4%|▍         | 2/50 [00:00<00:11,  4.17it/s]
DDIM Sampler:   6%|▌         | 3/50 [00:00<00:11,  4.23it/s]
DDIM Sampler:   8%|▊         | 4/50 [00:00<00:10,  4.25it/s]
DDIM Sampler:  10%|█         | 5/50 [00:01<00:10,  4.26it/s]
DDIM Sampler:  12%|█▏        | 6/50 [00:01<00:10,  4.27it/s]
DDIM Sampler:  14%|█▍        | 7/50 [00:01<00:10,  4.28it/s]
DDIM Sampler:  16%|█▌        | 8/50 [00:01<00:09,  4.28it/s]
DDIM Sampler:  18%|█▊        | 9/50 [00:02<00:09,  4.29it/s]
DDIM Sampler:  20%|██        | 10/50 [00:02<00:09,  4.28it/s]
DDIM Sampler:  22%|██▏       | 11/50 [00:02<00:09,  4.29it/s]
DDIM Sampler:  24%|██▍       | 12/50 [00:02<00:08,  4.29it/s]
DDIM Sampler:  26%|██▌       | 13/50 [00:03<00:08,  4.29it/s]
DDIM Sampler:  28%|██▊       | 14/50 [00:03<00:08,  4.29it/s]
DDIM Sampler:  30%|███       | 15/50 [00:03<00:08,  4.29it/s]
DDIM Sampler:  32%|███▏      | 16/50 [00:03<00:07,  4.29it/s]
DDIM Sampler:  34%|███▍      | 17/50 [00:03<00:07,  4.29it/s]
DDIM Sampler:  36%|███▌      | 18/50 [00:04<00:07,  4.29it/s]
DDIM Sampler:  38%|███▊      | 19/50 [00:04<00:07,  4.29it/s]
DDIM Sampler:  40%|████      | 20/50 [00:04<00:06,  4.29it/s]
DDIM Sampler:  42%|████▏     | 21/50 [00:04<00:06,  4.29it/s]
DDIM Sampler:  44%|████▍     | 22/50 [00:05<00:06,  4.29it/s]
DDIM Sampler:  46%|████▌     | 23/50 [00:05<00:06,  4.29it/s]
DDIM Sampler:  48%|████▊     | 24/50 [00:05<00:06,  4.28it/s]
DDIM Sampler:  50%|█████     | 25/50 [00:05<00:05,  4.29it/s]
DDIM Sampler:  52%|█████▏    | 26/50 [00:06<00:05,  4.29it/s]
DDIM Sampler:  54%|█████▍    | 27/50 [00:06<00:05,  4.29it/s]
DDIM Sampler:  56%|█████▌    | 28/50 [00:06<00:05,  4.29it/s]
DDIM Sampler:  58%|█████▊    | 29/50 [00:06<00:04,  4.29it/s]
DDIM Sampler:  60%|██████    | 30/50 [00:07<00:04,  4.29it/s]
DDIM Sampler:  62%|██████▏   | 31/50 [00:07<00:04,  4.28it/s]
DDIM Sampler:  64%|██████▍   | 32/50 [00:07<00:04,  4.28it/s]
DDIM Sampler:  66%|██████▌   | 33/50 [00:07<00:03,  4.29it/s]
DDIM Sampler:  68%|██████▊   | 34/50 [00:07<00:03,  4.29it/s]
DDIM Sampler:  70%|███████   | 35/50 [00:08<00:03,  4.29it/s]
DDIM Sampler:  72%|███████▏  | 36/50 [00:08<00:03,  4.29it/s]
DDIM Sampler:  74%|███████▍  | 37/50 [00:08<00:03,  4.29it/s]
DDIM Sampler:  76%|███████▌  | 38/50 [00:08<00:02,  4.29it/s]
DDIM Sampler:  78%|███████▊  | 39/50 [00:09<00:02,  4.29it/s]
DDIM Sampler:  80%|████████  | 40/50 [00:09<00:02,  4.29it/s]
DDIM Sampler:  82%|████████▏ | 41/50 [00:09<00:02,  4.29it/s]
DDIM Sampler:  84%|████████▍ | 42/50 [00:09<00:01,  4.29it/s]
DDIM Sampler:  86%|████████▌ | 43/50 [00:10<00:01,  4.29it/s]
DDIM Sampler:  88%|████████▊ | 44/50 [00:10<00:01,  4.29it/s]
DDIM Sampler:  90%|█████████ | 45/50 [00:10<00:01,  4.29it/s]
DDIM Sampler:  92%|█████████▏| 46/50 [00:10<00:00,  4.29it/s]
DDIM Sampler:  94%|█████████▍| 47/50 [00:10<00:00,  4.29it/s]
DDIM Sampler:  96%|█████████▌| 48/50 [00:11<00:00,  4.29it/s]
DDIM Sampler:  98%|█████████▊| 49/50 [00:11<00:00,  4.30it/s]
DDIM Sampler: 100%|██████████| 50/50 [00:11<00:00,  4.30it/s]
DDIM Sampler: 100%|██████████| 50/50 [00:11<00:00,  4.28it/s]
Generated outputs/image0-steps50-cfg8.0-face_enhance.png
Running upscaler with face enhancement
Image 0, upscaling 1/2, 2.0x
Image 0, upscaling 2/2, 4.0x
Written poseimg to outputs/openpose.png
Done. 2 image(s).
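
The image sizes implied by these logs can be worked out from the DDIM latent shape and the two upscaling passes. The sketch below assumes the standard Stable Diffusion 1.x latent-to-pixel factor of 8, and it assumes that the "2.0x" and "4.0x" log entries are cumulative factors, meaning each pass doubles the previous result. The function names are hypothetical.

```python
LATENT_SCALE = 8  # Stable Diffusion 1.x latents are 8x smaller per side than the image


def pixel_size(latent_shape):
    """Map a (batch, channels, height, width) latent shape to (height, width) in pixels."""
    _, _, height, width = latent_shape
    return height * LATENT_SCALE, width * LATENT_SCALE


def upscaled_sizes(base, per_pass_factor, passes):
    """Return (height, width) after each pass, assuming the factors compound."""
    sizes = []
    height, width = base
    for _ in range(passes):
        height, width = int(height * per_pass_factor), int(width * per_pass_factor)
        sizes.append((height, width))
    return sizes


base = pixel_size((1, 4, 96, 64))    # latent shape from the DDIM log line
print(base)                          # (768, 512)
print(upscaled_sizes(base, 2.0, 2))  # [(1536, 1024), (3072, 2048)]
```

Under these assumptions the sampler produces a 768 x 512 portrait image, and the two Real-ESRGAN passes with upscale_factor 2 bring it to 3072 x 2048.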

Version Details

Version ID: 1833bc8044109a1d7767f6bcffa05e55fbf35eccba8a2f1e9b1be16b9ca56801
Version Created: September 25, 2023
