zsxkib/instant-id 🔢🖼️📝❓✓ → 🖼️
About
Make realistic images of real people instantly

Example Output
Prompt:
"analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality"
Output

Performance Metrics
Prediction Time: 36.02s
Total Time: 36.06s
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/KIIutO7jIleskKaWebhvurgBUlHR6M6KN7KHaMMWSt4OnVrF/musk_resize.jpeg",
  "prompt": "analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality",
  "scheduler": "EulerDiscreteScheduler",
  "enable_lcm": false,
  "pose_image": "https://replicate.delivery/pbxt/KJmFdQRQVDXGDVdVXftLvFrrvgOPXXRXbzIVEyExPYYOFPyF/80048a6e6586759dbcb529e74a9042ca.jpeg",
  "num_outputs": 1,
  "sdxl_weights": "protovision-xl-high-fidel",
  "output_format": "webp",
  "pose_strength": 0.4,
  "canny_strength": 0.3,
  "depth_strength": 0.5,
  "guidance_scale": 5,
  "output_quality": 80,
  "negative_prompt": "(lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch,deformed, mutated, cross-eyed, ugly, disfigured",
  "ip_adapter_scale": 0.8,
  "lcm_guidance_scale": 1.5,
  "num_inference_steps": 30,
  "enable_pose_controlnet": true,
  "enhance_nonface_region": true,
  "enable_canny_controlnet": false,
  "enable_depth_controlnet": false,
  "lcm_num_inference_steps": 5,
  "face_detection_input_width": 640,
  "face_detection_input_height": 640,
  "controlnet_conditioning_scale": 0.8
}
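The example inputs above can be sent to the model programmatically. The following is a minimal sketch, assuming the official `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_prediction` are hypothetical and chosen for illustration; the parameter names and values come from the example above, and the version ID is the one listed under Version Details.

```python
"""Sketch: call zsxkib/instant-id with a subset of the example inputs."""

MODEL = "zsxkib/instant-id"
# Version ID as listed under "Version Details" on this page.
VERSION = "2e4785a4d80dadf580077b2244c8d7c05d8e3faac04a04c02d8e099dd2876789"


def build_input(image_url, prompt, pose_image_url=None):
    """Return an input payload (hypothetical helper, not part of the model)."""
    payload = {
        "image": image_url,  # the only required input
        "prompt": prompt,
        "scheduler": "EulerDiscreteScheduler",
        "sdxl_weights": "protovision-xl-high-fidel",
        "num_inference_steps": 30,
        "guidance_scale": 5,
        "ip_adapter_scale": 0.8,
        "controlnet_conditioning_scale": 0.8,
        "output_format": "webp",
        "output_quality": 80,
    }
    if pose_image_url is not None:
        # Pose guidance is optional; enable it only when a reference is given.
        payload["pose_image"] = pose_image_url
        payload["enable_pose_controlnet"] = True
        payload["pose_strength"] = 0.4
    return payload


def run_prediction(payload):
    """Run the hosted model; requires network access and an API token."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)
```

Pinning the version in the model reference keeps results tied to the weights and code documented here. Calling `run_prediction(build_input(face_url, "analog film photo of a man"))` would start a prediction, with `face_url` standing in for your own image URL.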
Input Parameters
- seed: Random seed. Leave blank to randomize the seed
- image (required): Input face image
- prompt: Input prompt
- scheduler: Scheduler
- enable_lcm: Enable fast inference with LCM (Latent Consistency Models). Speeds up inference at some cost to image quality. Performs better with close-up portrait face images
- pose_image: (Optional) reference pose image
- num_outputs: Number of images to output
- sdxl_weights: Pick which base weights you want to use
- output_format: Format of the output images
- pose_strength: Openpose ControlNet strength, effective only if `enable_pose_controlnet` is true
- canny_strength: Canny ControlNet strength, effective only if `enable_canny_controlnet` is true
- depth_strength: Depth ControlNet strength, effective only if `enable_depth_controlnet` is true
- guidance_scale: Scale for classifier-free guidance
- output_quality: Quality of the output images, from 0 (lowest) to 100 (best)
- negative_prompt: Input negative prompt
- ip_adapter_scale: Scale for image adapter strength (for detail)
- lcm_guidance_scale: Scale for classifier-free guidance when using LCM. Only used when `enable_lcm` is true
- num_inference_steps: Number of denoising steps
- disable_safety_checker: Disable safety checker for generated images
- enable_pose_controlnet: Enable Openpose ControlNet. When false, `pose_strength` is ignored
- enhance_nonface_region: Enhance non-face region
- enable_canny_controlnet: Enable Canny ControlNet. When false, `canny_strength` is ignored
- enable_depth_controlnet: Enable Depth ControlNet. When false, `depth_strength` is ignored
- lcm_num_inference_steps: Number of denoising steps when using LCM. Only used when `enable_lcm` is true
- face_detection_input_width: Width of the input image for face detection
- face_detection_input_height: Height of the input image for face detection
- controlnet_conditioning_scale: Scale for IdentityNet strength (for fidelity)
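Several parameters only matter when a companion flag is set: each ControlNet strength depends on its `enable_*_controlnet` flag, and the `lcm_*` values apply only when `enable_lcm` is true. The sketch below resolves which values take effect for a given input. It illustrates the documented semantics and is not the model's own code. The function name `effective_settings` is hypothetical, and treating a missing flag as false, as well as assuming the LCM values replace `num_inference_steps` and `guidance_scale` when LCM is enabled, are assumptions made here.

```python
def effective_settings(inputs):
    """Resolve which documented values apply (illustrative helper only)."""
    controlnets = {}
    for name in ("pose", "canny", "depth"):
        # A strength is effective only if its enable flag is true.
        # Assumption: a missing flag is treated as false.
        if inputs.get(f"enable_{name}_controlnet", False):
            controlnets[name] = inputs.get(f"{name}_strength")

    if inputs.get("enable_lcm", False):
        # Assumption: the LCM values replace the regular ones.
        steps = inputs.get("lcm_num_inference_steps")
        guidance = inputs.get("lcm_guidance_scale")
    else:
        steps = inputs.get("num_inference_steps")
        guidance = inputs.get("guidance_scale")

    return {"controlnets": controlnets, "steps": steps, "guidance": guidance}


example = {
    "enable_lcm": False,
    "pose_strength": 0.4,
    "canny_strength": 0.3,
    "depth_strength": 0.5,
    "guidance_scale": 5,
    "lcm_guidance_scale": 1.5,
    "num_inference_steps": 30,
    "lcm_num_inference_steps": 5,
    "enable_pose_controlnet": True,
    "enable_canny_controlnet": False,
    "enable_depth_controlnet": False,
}
print(effective_settings(example))
# {'controlnets': {'pose': 0.4}, 'steps': 30, 'guidance': 5}
```

With the example inputs from this page, only the pose ControlNet is active and the run uses 30 denoising steps, which matches the 30-step progress bar in the execution logs.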
Output Schema
Output
Example Execution Logs
Using seed: 25815
[~] Loading new SDXL weights: checkpoints/models--stablediffusionapi--protovision-xl-high-fidel/
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLInstantIDPipeline and will be ignored.
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]
Loading pipeline components...: 29%|██▊ | 2/7 [00:00<00:00, 5.96it/s]
Loading pipeline components...: 43%|████▎ | 3/7 [00:00<00:01, 3.94it/s]
Loading pipeline components...: 71%|███████▏ | 5/7 [00:04<00:02, 1.15s/it]
Loading pipeline components...: 100%|██████████| 7/7 [00:04<00:00, 1.48it/s]
Loading pipeline components...: 100%|██████████| 7/7 [00:04<00:00, 1.50it/s]
[~] Seting up LCM (just in case)
Start inference...
[Debug] Prompt: analog film photo of a man. faded film, desaturated, 35mm photo, grainy, vignette, vintage, Kodachrome, Lomography, stained, highly detailed, found footage, masterpiece, best quality,
[Debug] Neg Prompt: (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured (lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch,deformed, mutated, cross-eyed, ugly, disfigured
0%| | 0/30 [00:00<?, ?it/s]
3%|▎ | 1/30 [00:00<00:07, 4.12it/s]
7%|▋ | 2/30 [00:00<00:06, 4.12it/s]
10%|█ | 3/30 [00:00<00:06, 4.13it/s]
13%|█▎ | 4/30 [00:00<00:06, 4.12it/s]
17%|█▋ | 5/30 [00:01<00:06, 4.12it/s]
20%|██ | 6/30 [00:01<00:05, 4.12it/s]
23%|██▎ | 7/30 [00:01<00:05, 4.12it/s]
27%|██▋ | 8/30 [00:01<00:05, 4.12it/s]
30%|███ | 9/30 [00:02<00:05, 4.12it/s]
33%|███▎ | 10/30 [00:02<00:04, 4.12it/s]
37%|███▋ | 11/30 [00:02<00:04, 4.12it/s]
40%|████ | 12/30 [00:02<00:04, 4.12it/s]
43%|████▎ | 13/30 [00:03<00:04, 4.12it/s]
47%|████▋ | 14/30 [00:03<00:03, 4.12it/s]
50%|█████ | 15/30 [00:03<00:03, 4.12it/s]
53%|█████▎ | 16/30 [00:03<00:03, 4.12it/s]
57%|█████▋ | 17/30 [00:04<00:03, 4.12it/s]
60%|██████ | 18/30 [00:04<00:02, 4.12it/s]
63%|██████▎ | 19/30 [00:04<00:02, 4.11it/s]
67%|██████▋ | 20/30 [00:04<00:02, 4.11it/s]
70%|███████ | 21/30 [00:05<00:02, 4.10it/s]
73%|███████▎ | 22/30 [00:05<00:01, 4.10it/s]
77%|███████▋ | 23/30 [00:05<00:01, 4.10it/s]
80%|████████ | 24/30 [00:05<00:01, 4.09it/s]
83%|████████▎ | 25/30 [00:06<00:01, 4.10it/s]
87%|████████▋ | 26/30 [00:06<00:00, 4.10it/s]
90%|█████████ | 27/30 [00:06<00:00, 4.10it/s]
93%|█████████▎| 28/30 [00:06<00:00, 4.09it/s]
97%|█████████▋| 29/30 [00:07<00:00, 4.09it/s]
100%|██████████| 30/30 [00:07<00:00, 4.09it/s]
100%|██████████| 30/30 [00:07<00:00, 4.11it/s]
NSFW content detected: False
[~] Saving to /tmp/out_0.webp...
[~] Output format: WEBP
[~] Output quality: 80
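The logs record the seed that was actually used, which is useful when `seed` was left blank and a result needs to be reproduced. The following is a small sketch, assuming the log text has the same message wording as the example above. The function name `summarize_log` and the returned keys are hypothetical.

```python
import re


def summarize_log(log_text):
    """Extract the seed, safety-check result, and saved paths from log text."""
    seed = re.search(r"Using seed: (\d+)", log_text)
    nsfw = re.search(r"NSFW content detected: (True|False)", log_text)
    # Paths are logged as "Saving to <path>..." with a literal trailing "...".
    saved = re.findall(r"Saving to (\S+?)\.\.\.", log_text)
    return {
        "seed": int(seed.group(1)) if seed else None,
        "nsfw_detected": (nsfw.group(1) == "True") if nsfw else None,
        "saved_paths": saved,
    }


sample = (
    "Using seed: 25815\n"
    "NSFW content detected: False\n"
    "[~] Saving to /tmp/out_0.webp...\n"
)
print(summarize_log(sample))
# {'seed': 25815, 'nsfw_detected': False, 'saved_paths': ['/tmp/out_0.webp']}
```

Passing the recovered value back as the `seed` input, together with the same remaining inputs, is the usual way to try to regenerate an image.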
Version Details
- Version ID: 2e4785a4d80dadf580077b2244c8d7c05d8e3faac04a04c02d8e099dd2876789
- Version Created: December 11, 2024