cjwbw/blipdiffusion 🔢📝🖼️ → 🖼️
About
Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

Example Output
Prompt:
"painting by Van Gogh"
Output:
[generated image]
Performance Metrics
- Prediction Time: 3.30s
- Total Time: 98.66s
All Input Parameters
{
  "width": 512,
  "height": 512,
  "prompt": "painting by Van Gogh",
  "guidance_scale": 7.5,
  "negative_prompt": "over-exposure, under-exposure, saturated, duplicate, out of frame, lowres, cropped, worst quality, low quality, jpeg artifacts, morbid, mutilated, ugly, bad anatomy, bad proportions, deformed, blurry",
  "reference_image": "https://replicate.delivery/pbxt/KNZcJhVZuWiMWYReUDO2J0Up9CrBN7NmubFg2ZHADbJ5tP9c/dog.png",
  "num_inference_steps": 25,
  "source_subject_category": "dog",
  "target_subject_category": "dog"
}
Input Parameters
- seed: Random seed. Leave blank to randomize the seed.
- prompt: The prompt to guide the image generation.
- guidance_scale: Scale for classifier-free guidance.
- negative_prompt: The prompt or prompts not to guide the image generation.
- reference_image (required): The reference image to condition the generation on.
- num_inference_steps: The number of denoising steps. More denoising steps usually lead to a higher-quality image at the expense of slower inference.
- source_subject_category: The source subject category.
- target_subject_category: The target subject category.
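The parameters above can be assembled into an input payload and passed to the model through the Replicate Python client. The sketch below assumes that client; `build_input` is a hypothetical helper name, and the actual `replicate.run` call is commented out because it requires a `REPLICATE_API_TOKEN` and network access. Parameter names and defaults are taken from the example input above.

```python
# Sketch: build the input payload for cjwbw/blipdiffusion as documented
# in the parameter list above, then (optionally) run it via Replicate.
# `build_input` is a hypothetical helper, not part of any library.

def build_input(reference_image, prompt, source_subject_category,
                target_subject_category, width=512, height=512,
                guidance_scale=7.5, num_inference_steps=25,
                negative_prompt="", seed=None):
    """Assemble the input dict expected by the model."""
    payload = {
        "width": width,
        "height": height,
        "prompt": prompt,
        "guidance_scale": guidance_scale,
        "negative_prompt": negative_prompt,
        "reference_image": reference_image,
        "num_inference_steps": num_inference_steps,
        "source_subject_category": source_subject_category,
        "target_subject_category": target_subject_category,
    }
    if seed is not None:  # omit seed entirely to randomize
        payload["seed"] = seed
    return payload

payload = build_input(
    reference_image="https://replicate.delivery/pbxt/KNZcJhVZuWiMWYReUDO2J0Up9CrBN7NmubFg2ZHADbJ5tP9c/dog.png",
    prompt="painting by Van Gogh",
    source_subject_category="dog",
    target_subject_category="dog",
)

# Requires the `replicate` package and a REPLICATE_API_TOKEN env var:
# import replicate
# output = replicate.run(
#     "cjwbw/blipdiffusion:81a70441392af1288983861c01b09317acfc5eb5ba1343e86a2578487b26620f",
#     input=payload,
# )
```

Leaving `seed` out of the payload matches the card's note that a blank seed is randomized by the model.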
Output Schema
Output
Example Execution Logs
0%|          | 0/26 [00:00<?, ?it/s]
4%|▍         | 1/26 [00:00<00:03, 7.75it/s]
15%|█▌        | 4/26 [00:00<00:01, 18.80it/s]
27%|██▋       | 7/26 [00:00<00:00, 21.42it/s]
38%|███▊      | 10/26 [00:00<00:00, 22.74it/s]
50%|█████     | 13/26 [00:00<00:00, 23.47it/s]
62%|██████▏   | 16/26 [00:00<00:00, 23.89it/s]
73%|███████▎  | 19/26 [00:00<00:00, 24.06it/s]
85%|████████▍ | 22/26 [00:00<00:00, 24.28it/s]
96%|█████████▌| 25/26 [00:01<00:00, 24.07it/s]
100%|██████████| 26/26 [00:01<00:00, 22.92it/s]
Version Details
- Version ID: 81a70441392af1288983861c01b09317acfc5eb5ba1343e86a2578487b26620f
- Version Created: February 10, 2024