zsxkib/instant-id 🔢🖼️📝❓✓ → 🖼️
About
Make realistic images of real people instantly
Example Output
Prompt:
"analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality"
Output
Performance Metrics
36.02s
Prediction Time
36.06s
Total Time
All Input Parameters
{
"image": "https://replicate.delivery/pbxt/KIIutO7jIleskKaWebhvurgBUlHR6M6KN7KHaMMWSt4OnVrF/musk_resize.jpeg",
"prompt": "analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality",
"scheduler": "EulerDiscreteScheduler",
"enable_lcm": false,
"pose_image": "https://replicate.delivery/pbxt/KJmFdQRQVDXGDVdVXftLvFrrvgOPXXRXbzIVEyExPYYOFPyF/80048a6e6586759dbcb529e74a9042ca.jpeg",
"num_outputs": 1,
"sdxl_weights": "protovision-xl-high-fidel",
"output_format": "webp",
"pose_strength": 0.4,
"canny_strength": 0.3,
"depth_strength": 0.5,
"guidance_scale": 5,
"output_quality": 80,
"negative_prompt": "(lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch,deformed, mutated, cross-eyed, ugly, disfigured",
"ip_adapter_scale": 0.8,
"lcm_guidance_scale": 1.5,
"num_inference_steps": 30,
"enable_pose_controlnet": true,
"enhance_nonface_region": true,
"enable_canny_controlnet": false,
"enable_depth_controlnet": false,
"lcm_num_inference_steps": 5,
"face_detection_input_width": 640,
"face_detection_input_height": 640,
"controlnet_conditioning_scale": 0.8
}
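The input above can be submitted programmatically. The following is a minimal sketch, not the author's own client code: it assumes the third-party `replicate` Python package is installed and that a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_prediction` are hypothetical, and tying `enable_pose_controlnet` to the presence of a pose image is a choice made for this sketch, not documented model behavior. The version hash is the one listed under Version Details.

```python
MODEL = "zsxkib/instant-id"
VERSION = "2e4785a4d80dadf580077b2244c8d7c05d8e3faac04a04c02d8e099dd2876789"


def build_input(image_url, prompt, pose_image_url=None):
    """Assemble a payload from a subset of the parameters shown above.

    Values mirror the example run; parameters that are omitted fall back
    to whatever defaults the model defines.
    """
    payload = {
        "image": image_url,  # the only required parameter
        "prompt": prompt,
        "scheduler": "EulerDiscreteScheduler",
        "enable_lcm": False,
        "num_outputs": 1,
        "sdxl_weights": "protovision-xl-high-fidel",
        "output_format": "webp",
        "output_quality": 80,
        "guidance_scale": 5,
        "num_inference_steps": 30,
        "ip_adapter_scale": 0.8,
        "controlnet_conditioning_scale": 0.8,
        # Sketch-level choice: only enable the pose ControlNet when a
        # reference pose image is supplied.
        "enable_pose_controlnet": pose_image_url is not None,
    }
    if pose_image_url is not None:
        payload["pose_image"] = pose_image_url
        payload["pose_strength"] = 0.4
    return payload


def run_prediction(payload):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)
```

Calling `run_prediction(build_input(face_url, prompt, pose_url))` would start a prediction; the shape of the returned value depends on the client version, so inspect it before assuming a list of URLs.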
Input Parameters
- seed
- Random seed. Leave blank to randomize the seed
- image (required)
- Input face image
- prompt
- Input prompt
- scheduler
- Scheduler
- enable_lcm
- Enable fast inference with LCM (Latent Consistency Models). This reduces the number of inference steps at some cost to the quality of the generated image. Performs better with close-up portrait face images
- pose_image
- (Optional) reference pose image
- num_outputs
- Number of images to output
- sdxl_weights
- Pick which base weights you want to use
- output_format
- Format of the output images
- pose_strength
- Openpose ControlNet strength, effective only if `enable_pose_controlnet` is true
- canny_strength
- Canny ControlNet strength, effective only if `enable_canny_controlnet` is true
- depth_strength
- Depth ControlNet strength, effective only if `enable_depth_controlnet` is true
- guidance_scale
- Scale for classifier-free guidance
- output_quality
- Quality of the output images, from 0 to 100. 100 is best quality, 0 is lowest quality.
- negative_prompt
- Input Negative Prompt
- ip_adapter_scale
- Scale for image adapter strength (for detail)
- lcm_guidance_scale
- Scale for classifier-free guidance when using LCM. Only used when `enable_lcm` is set to true
- num_inference_steps
- Number of denoising steps
- disable_safety_checker
- Disable safety checker for generated images
- enable_pose_controlnet
- Enable Openpose ControlNet. When set to false, `pose_strength` is ignored
- enhance_nonface_region
- Enhance non-face region
- enable_canny_controlnet
- Enable Canny ControlNet. When set to false, `canny_strength` is ignored
- enable_depth_controlnet
- Enable Depth ControlNet. When set to false, `depth_strength` is ignored
- lcm_num_inference_steps
- Number of denoising steps when using LCM. Only used when `enable_lcm` is set to true
- face_detection_input_width
- Width of the input image for face detection
- face_detection_input_height
- Height of the input image for face detection
- controlnet_conditioning_scale
- Scale for IdentityNet strength (for fidelity)
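Several parameters in this list only take effect when a companion flag is enabled. As an illustration, the sketch below flags values in a payload that the descriptions above say would be ignored. The mapping and the function name `ignored_params` are hypothetical helpers written for this example; a flag that is absent from the payload is treated as unknown rather than disabled, because the model's defaults are not stated here.

```python
# Dependent parameter -> flag that must be true for it to have any effect,
# taken from the parameter descriptions above.
DEPENDENT_PARAMS = {
    "pose_strength": "enable_pose_controlnet",
    "canny_strength": "enable_canny_controlnet",
    "depth_strength": "enable_depth_controlnet",
    "lcm_guidance_scale": "enable_lcm",
    "lcm_num_inference_steps": "enable_lcm",
}


def ignored_params(payload):
    """Return the names of supplied parameters whose flag is explicitly false."""
    return sorted(
        name
        for name, flag in DEPENDENT_PARAMS.items()
        if name in payload and payload.get(flag) is False
    )
```

Applied to the example input shown earlier (pose ControlNet enabled; Canny, Depth, and LCM disabled), this reports `canny_strength`, `depth_strength`, `lcm_guidance_scale`, and `lcm_num_inference_steps` as having no effect on that run.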
Output Schema
Output
Example Execution Logs
Using seed: 25815
[~] Loading new SDXL weights: checkpoints/models--stablediffusionapi--protovision-xl-high-fidel/
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLInstantIDPipeline and will be ignored.
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]
Loading pipeline components...: 29%|██▊ | 2/7 [00:00<00:00, 5.96it/s]
Loading pipeline components...: 43%|████▎ | 3/7 [00:00<00:01, 3.94it/s]
Loading pipeline components...: 71%|███████▏ | 5/7 [00:04<00:02, 1.15s/it]
Loading pipeline components...: 100%|██████████| 7/7 [00:04<00:00, 1.48it/s]
Loading pipeline components...: 100%|██████████| 7/7 [00:04<00:00, 1.50it/s]
[~] Seting up LCM (just in case)
Start inference...
[Debug] Prompt: analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality,
[Debug] Neg Prompt: (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch,deformed, mutated, cross-eyed, ugly, disfigured
0%| | 0/30 [00:00<?, ?it/s]
3%|▎ | 1/30 [00:00<00:07, 4.12it/s]
7%|▋ | 2/30 [00:00<00:06, 4.12it/s]
10%|█ | 3/30 [00:00<00:06, 4.13it/s]
13%|█▎ | 4/30 [00:00<00:06, 4.12it/s]
17%|█▋ | 5/30 [00:01<00:06, 4.12it/s]
20%|██ | 6/30 [00:01<00:05, 4.12it/s]
23%|██▎ | 7/30 [00:01<00:05, 4.12it/s]
27%|██▋ | 8/30 [00:01<00:05, 4.12it/s]
30%|███ | 9/30 [00:02<00:05, 4.12it/s]
33%|███▎ | 10/30 [00:02<00:04, 4.12it/s]
37%|███▋ | 11/30 [00:02<00:04, 4.12it/s]
40%|████ | 12/30 [00:02<00:04, 4.12it/s]
43%|████▎ | 13/30 [00:03<00:04, 4.12it/s]
47%|████▋ | 14/30 [00:03<00:03, 4.12it/s]
50%|█████ | 15/30 [00:03<00:03, 4.12it/s]
53%|█████▎ | 16/30 [00:03<00:03, 4.12it/s]
57%|█████▋ | 17/30 [00:04<00:03, 4.12it/s]
60%|██████ | 18/30 [00:04<00:02, 4.12it/s]
63%|██████▎ | 19/30 [00:04<00:02, 4.11it/s]
67%|██████▋ | 20/30 [00:04<00:02, 4.11it/s]
70%|███████ | 21/30 [00:05<00:02, 4.10it/s]
73%|███████▎ | 22/30 [00:05<00:01, 4.10it/s]
77%|███████▋ | 23/30 [00:05<00:01, 4.10it/s]
80%|████████ | 24/30 [00:05<00:01, 4.09it/s]
83%|████████▎ | 25/30 [00:06<00:01, 4.10it/s]
87%|████████▋ | 26/30 [00:06<00:00, 4.10it/s]
90%|█████████ | 27/30 [00:06<00:00, 4.10it/s]
93%|█████████▎| 28/30 [00:06<00:00, 4.09it/s]
97%|█████████▋| 29/30 [00:07<00:00, 4.09it/s]
100%|██████████| 30/30 [00:07<00:00, 4.09it/s]
100%|██████████| 30/30 [00:07<00:00, 4.11it/s]
NSFW content detected: False
[~] Saving to /tmp/out_0.webp...
[~] Output format: WEBP
[~] Output quality: 80
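The progress lines in these logs can be read mechanically to estimate how much of the run was spent denoising. The sketch below is an illustrative parser for lines in the format shown above; the regular expression and the name `parse_progress` are assumptions made for this example and are not part of the model or of any library.

```python
import re

# Matches the tail of a progress line such as
# "30/30 [00:07<00:00, 4.11it/s]" or "5/7 [00:04<00:02, 1.15s/it]".
PROGRESS = re.compile(
    r"(\d+)/(\d+) \[(\d+):(\d+)<[^,\]]*,\s*([\d.]+)(it/s|s/it)\]"
)


def parse_progress(line):
    """Extract step counts, elapsed seconds, and rate from one progress line.

    Returns None for lines without a numeric rate (for example the first
    line of a bar, which reports "?it/s").
    """
    match = PROGRESS.search(line)
    if match is None:
        return None
    done, total, mins, secs, rate, unit = match.groups()
    rate = float(rate)
    return {
        "done": int(done),
        "total": int(total),
        "elapsed_s": int(mins) * 60 + int(secs),
        # Normalise "s/it" readings to iterations per second.
        "its_per_sec": rate if unit == "it/s" else 1.0 / rate,
    }
```

For the final line of the run above, 30 steps at 4.11 it/s works out to roughly 7.3 s of denoising, which is only part of the 36.02 s prediction time; the logs do not itemize the remainder.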
Version Details
- Version ID
- 2e4785a4d80dadf580077b2244c8d7c05d8e3faac04a04c02d8e099dd2876789
- Version Created
- December 11, 2024