tgohblio/instant-id-multicontrolnet

321.1K runs · Published Feb 2024 · Cog 0.9.3 · GitHub · Paper · License
image-consistent-character-generation image-to-image

About

InstantID with multiple ControlNets, a choice of SDXL base models, and support for ByteDance's latest SDXL-Lightning fast inference.

Example Output

Prompt:

"woman as elven princess, with blue sheen dress, masterpiece"

Output

Example output

Performance Metrics

Prediction time: 9.76s
Total time: 1467.55s
All Input Parameters
{
  "pose": false,
  "seed": 0,
  "canny": false,
  "model": "AlbedoBase XL V2",
  "prompt": "woman as elven princess, with blue sheen dress, masterpiece",
  "depth_map": false,
  "num_steps": 25,
  "scheduler": "DPMSolverMultistepScheduler",
  "pose_strength": 0.5,
  "canny_strength": 0.5,
  "depth_strength": 0.5,
  "guidance_scale": 7,
  "safety_checker": true,
  "face_image_path": "https://replicate.delivery/pbxt/KRsl57SjTUo1WOBw1ir3UVI06jpQ7ybyEtdprpqF2qja40Wn/halle-berry.jpeg",
  "lightning_steps": "4step",
  "negative_prompt": "ugly, low quality, deformed face, nsfw",
  "enable_fast_mode": true,
  "adapter_strength_ratio": 0.8,
  "enhance_non_face_region": true,
  "identitynet_strength_ratio": 0.8
}
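The example payload above can be submitted from Python. This is a minimal sketch, assuming the official `replicate` client is installed (`pip install replicate`) and a `REPLICATE_API_TOKEN` environment variable is set; the model identifier is taken from this page.

```python
import os

# Model identifier from this page.
MODEL = "tgohblio/instant-id-multicontrolnet"

# Input payload mirroring "All Input Parameters" above.
inputs = {
    "prompt": "woman as elven princess, with blue sheen dress, masterpiece",
    "negative_prompt": "ugly, low quality, deformed face, nsfw",
    "model": "AlbedoBase XL V2",
    "face_image_path": "https://replicate.delivery/pbxt/KRsl57SjTUo1WOBw1ir3UVI06jpQ7ybyEtdprpqF2qja40Wn/halle-berry.jpeg",
    "enable_fast_mode": True,
    "lightning_steps": "4step",
    "guidance_scale": 7,
    "identitynet_strength_ratio": 0.8,
    "adapter_strength_ratio": 0.8,
    "safety_checker": True,
}

# Only attempt the remote call when credentials are available.
if os.environ.get("REPLICATE_API_TOKEN"):
    import replicate

    # Returns the output described by the schema below: a URI string.
    output = replicate.run(MODEL, input=inputs)
    print(output)
```

Pinning the exact version listed under "Version Details" (`MODEL + ":" + version_id`) keeps results stable if the model is updated later.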
Input Parameters
pose · Type: boolean · Default: false
Use pose for skeleton inference.
seed · Type: integer · Default: 0 · Range: 0 - 2147483647
Seed number. Set to a non-zero value to make the image reproducible.
canny · Type: boolean · Default: false
Use canny for edge detection.
model · Default: AlbedoBase XL V2
Select the SDXL base model.
prompt · Type: string · Default: a person
Input prompt.
depth_map · Type: boolean · Default: false
Use depth map estimation.
num_steps · Type: integer · Default: 25 · Range: 1 - 50
Number of denoising steps. Not used when fast mode is enabled.
scheduler · Default: DPMSolverMultistepScheduler
Scheduler options. Not used when fast mode is enabled.
pose_strength · Type: number · Default: 1 · Range: 0 - 1.5
canny_strength · Type: number · Default: 0.5 · Range: 0 - 1.5
depth_strength · Type: number · Default: 0.5 · Range: 0 - 1.5
guidance_scale · Type: number · Default: 7 · Range: 0 - 10
Scale for classifier-free guidance; 4-8 is optimal. Not used when fast mode is enabled.
safety_checker · Type: boolean · Default: true
Safety checker is enabled by default. Un-tick to expose unfiltered results.
face_image_path (required) · Type: string
Image of your face.
lightning_steps · Default: 4step
Number of denoising steps when fast mode is enabled.
negative_prompt · Type: string · Default: (lowres, low quality, worst quality:1.2), (text:1.2), watermark, glitch, deformed, mutated, cross-eyed, ugly, disfigured, blurry, grainy
Input negative prompt.
pose_image_path · Type: string
Reference pose image.
enable_fast_mode · Type: boolean · Default: true
Enable SDXL-Lightning fast inference. Disable it when pose, canny, or depth map is used, for better-quality images.
adapter_strength_ratio · Type: number · Default: 0.8 · Range: 0 - 1
Image adapter strength (for detail).
enhance_non_face_region · Type: boolean · Default: true
Enhance the non-face region.
identitynet_strength_ratio · Type: number · Default: 0.8 · Range: 0 - 1
IdentityNet strength (for fidelity).
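The documented ranges can be checked client-side before submitting a prediction, so out-of-range values fail fast rather than at the API. This is a hypothetical helper (not part of the model) that encodes the ranges from the parameter list above:

```python
# Documented ranges from the parameter list above.
RANGES = {
    "seed": (0, 2147483647),
    "num_steps": (1, 50),
    "guidance_scale": (0, 10),
    "pose_strength": (0, 1.5),
    "canny_strength": (0, 1.5),
    "depth_strength": (0, 1.5),
    "adapter_strength_ratio": (0, 1),
    "identitynet_strength_ratio": (0, 1),
}


def validate(inputs: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # face_image_path is the only required parameter.
    if "face_image_path" not in inputs:
        problems.append("face_image_path is required")
    for key, (lo, hi) in RANGES.items():
        if key in inputs and not (lo <= inputs[key] <= hi):
            problems.append(f"{key}={inputs[key]} outside [{lo}, {hi}]")
    return problems
```

For example, `validate({"face_image_path": "me.jpg", "guidance_scale": 11})` reports that `guidance_scale` is outside its 0-10 range.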
Output Schema

Output

Type: string · Format: uri
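Since the output is a URI rather than image bytes, the file must be fetched separately. A minimal sketch using only the standard library, assuming the delivery URL is publicly fetchable (the helper names are hypothetical):

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(uri: str) -> str:
    """Derive a local filename from the URI's last path segment."""
    name = os.path.basename(urlparse(uri).path)
    return name or "output.png"  # fall back when the path has no filename


def save_output(uri: str, dest_dir: str = ".") -> str:
    """Download the output URI next to the current directory; return its path."""
    path = os.path.join(dest_dir, local_name(uri))
    urlretrieve(uri, path)  # plain HTTP GET; delivery URLs need no auth header
    return path
```

For instance, a `https://replicate.delivery/pbxt/.../out-0.png` output would be saved as `out-0.png`.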

Example Execution Logs
/root/.pyenv/versions/3.11.9/lib/python3.11/site-packages/insightface/utils/transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
  0%|          | 0/4 [00:00<?, ?it/s]
 25%|██▌       | 1/4 [00:00<00:02,  1.36it/s]
 50%|█████     | 2/4 [00:00<00:00,  2.32it/s]
 75%|███████▌  | 3/4 [00:01<00:00,  2.99it/s]
100%|██████████| 4/4 [00:01<00:00,  3.45it/s]
100%|██████████| 4/4 [00:01<00:00,  2.87it/s]
Version Details
Version ID
35324a7df2397e6e57dfd8f4f9d2910425f5123109c8c3ed035e769aeff9ff3c
Version Created
May 5, 2024