cjwbw/instructcv 🔢🖼️📝 → 🖼️

▶️ 359 runs 📅 Oct 2023 ⚙️ Cog 0.8.6 🔗 GitHub 📄 Paper ⚖️ License
image-object-detection image-segmentation image-to-image

About

Instruction-tuned text-to-image diffusion models as vision generalists

Example Output

[Image: example output]

Performance Metrics

4.09s Prediction Time
1.94s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/JfcQVKRWgew5Rzpti9VyJyG4Dfa6JfPx3xIV1wGBH6UVLbLs/pCrb5DS.jpg",
  "instruction": "Detect Berkeley's Sather tower.",
  "text_guidance_scale": 7.5,
  "image_guidance_scale": 1.5
}
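The parameters above could be sent to the model with the Replicate Python client. The following is a minimal sketch, not the author's reference usage: `build_input` and `run_instructcv` are hypothetical helper names, the version string is copied from the Version Details on this page, and the call assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment.

```python
# Model reference built from the owner/name and the Version ID listed on this page.
MODEL_VERSION = (
    "cjwbw/instructcv:"
    "3258454a0f4011005f51886a8d2c4015d5ec146652f2a28f042e5cc7e4ef85b4"
)


def build_input(image, instruction, text_guidance_scale=7.5,
                image_guidance_scale=1.5, num_inference_steps=50, seed=None):
    """Assemble the input payload (hypothetical helper, not part of the model)."""
    payload = {
        "image": image,
        "instruction": instruction,
        "text_guidance_scale": text_guidance_scale,
        "image_guidance_scale": image_guidance_scale,
        "num_inference_steps": num_inference_steps,
    }
    # Leave the seed out entirely so the model randomizes it.
    if seed is not None:
        payload["seed"] = seed
    return payload


def run_instructcv(payload):
    """Run the model on Replicate (hypothetical wrapper around replicate.run)."""
    import replicate  # imported lazily; requires `pip install replicate`
    return replicate.run(MODEL_VERSION, input=payload)


if __name__ == "__main__":
    example = build_input(
        image="https://replicate.delivery/pbxt/JfcQVKRWgew5Rzpti9VyJyG4Dfa6JfPx3xIV1wGBH6UVLbLs/pCrb5DS.jpg",
        instruction="Detect Berkeley's Sather tower.",
    )
    # Per the output schema the result is a URI; depending on the client
    # version it may be wrapped in a file-like object, so it is printed as text.
    print(str(run_instructcv(example)))
```

Keeping payload construction separate from the network call makes it possible to inspect or log the inputs before spending a prediction.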
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed.
image (required) Type: string
Input image.
instruction (required) Type: string
An instruction describing the specific vision task you want InstructCV to perform.
num_inference_steps Type: integer, Default: 50, Range: 1 - 500
The number of denoising steps. More denoising steps usually produce a higher-quality image at the cost of slower inference.
text_guidance_scale Type: number, Default: 7.5, Range: 1 - 20
Scale for classifier-free guidance. A higher guidance scale encourages the model to generate images that are closely linked to the text prompt, usually at the expense of lower image quality.
image_guidance_scale Type: number, Default: 1.5, Range: 1 - ∞
The image guidance scale pushes the generated image towards the initial image. A higher image guidance scale encourages the model to generate images that are closely linked to the source image, usually at the expense of lower image quality.
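To make the interplay of the two guidance scales concrete, here is a small sketch of how they might be combined at each denoising step. This page does not state the formula; the sketch assumes the two-scale classifier-free guidance rule used by InstructPix2Pix-style models, and `combine_guidance` and its argument names are hypothetical.

```python
def combine_guidance(eps_uncond, eps_image, eps_full,
                     text_guidance_scale=7.5, image_guidance_scale=1.5):
    """Blend three noise predictions with two guidance scales.

    ASSUMPTION: InstructPix2Pix-style guidance, not confirmed by this page.
      eps_uncond: prediction with neither image nor instruction
      eps_image:  prediction conditioned on the input image only
      eps_full:   prediction conditioned on image and instruction
    Inputs are equal-length sequences of floats.
    """
    return [
        u
        + image_guidance_scale * (i - u)   # pull towards the source image
        + text_guidance_scale * (f - i)    # pull towards the instruction
        for u, i, f in zip(eps_uncond, eps_image, eps_full)
    ]
```

Under this assumption, setting both scales to 1 reduces to the fully conditioned prediction, and raising either scale amplifies the corresponding conditioning signal.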
Output Schema

Output

Type: string, Format: uri
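Because the output is a URI rather than image bytes, a caller would typically download it. The sketch below uses only the Python standard library; `filename_from_uri` and `save_output` are hypothetical helpers, and the fallback name `output.png` is an assumption, since this page does not state the output image format.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri, default="output.png"):
    """Derive a local file name from the last path segment of the URI."""
    name = posixpath.basename(urlparse(uri).path)
    return name or default  # fall back when the path ends in "/"


def save_output(uri, directory="."):
    """Download the output image and return the local path."""
    path = os.path.join(directory, filename_from_uri(uri))
    urllib.request.urlretrieve(uri, path)
    return path
```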

Example Execution Logs
Using seed: 38141
  0%|          | 0/50 [00:00<?, ?it/s]
  4%|▍         | 2/50 [00:00<00:03, 14.77it/s]
 10%|█         | 5/50 [00:00<00:02, 19.05it/s]
 16%|█▌        | 8/50 [00:00<00:02, 20.39it/s]
 22%|██▏       | 11/50 [00:00<00:01, 21.02it/s]
 28%|██▊       | 14/50 [00:00<00:01, 21.35it/s]
 34%|███▍      | 17/50 [00:00<00:01, 21.57it/s]
 40%|████      | 20/50 [00:00<00:01, 21.72it/s]
 46%|████▌     | 23/50 [00:01<00:01, 21.76it/s]
 52%|█████▏    | 26/50 [00:01<00:01, 21.76it/s]
 58%|█████▊    | 29/50 [00:01<00:00, 21.80it/s]
 64%|██████▍   | 32/50 [00:01<00:00, 21.83it/s]
 70%|███████   | 35/50 [00:01<00:00, 21.83it/s]
 76%|███████▌  | 38/50 [00:01<00:00, 21.83it/s]
 82%|████████▏ | 41/50 [00:01<00:00, 21.84it/s]
 88%|████████▊ | 44/50 [00:02<00:00, 21.86it/s]
 94%|█████████▍| 47/50 [00:02<00:00, 21.86it/s]
100%|██████████| 50/50 [00:02<00:00, 21.87it/s]
100%|██████████| 50/50 [00:02<00:00, 21.49it/s]
Version Details
Version ID
3258454a0f4011005f51886a8d2c4015d5ec146652f2a28f042e5cc7e4ef85b4
Version Created
October 9, 2023