andreasjansson/blip-2 🖼️✓📝🔢 → 📝

▶️ 30.9M runs 📅 Feb 2023 ⚙️ Cog 0.8.3 🔗 GitHub 📄 Paper
image-captioning image-to-text visual-question-answering

About

Answers questions about images and, optionally, generates image captions

Example Output

Output

san francisco bay

Performance Metrics

0.95s Prediction Time
1.01s Total Time

All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/IJEPmgAlL2zNBNDoRRKFegTEcxnlRhoQxlNjPHSZEy0pSIKn/gg_bridge.jpeg",
  "caption": false,
  "question": "what body of water does this bridge cross?",
  "temperature": 1
}

Input Parameters

image (required)
Type: string
Input image to query or caption

caption
Type: boolean; Default: false
Select to generate an image caption instead of answering a question

context
Type: string
Optional. Previous questions and answers to use as context for answering the current question

question
Type: string; Default: What is this a picture of?
Question to ask about this image. Leave blank for captioning

temperature
Type: number; Default: 1; Range: 0.5 - 1
Temperature for use with nucleus sampling

use_nucleus_sampling
Type: boolean; Default: false
Enables nucleus sampling when generating responses
Output Schema

Output

Type: string

Example Execution Logs
input for question answering: Question: what body of water does this bridge cross? Answer:
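
The log line above suggests the question is wrapped in a "Question: ... Answer:" template before it reaches the model. The sketch below reproduces that template; how `context` is joined to the prompt is an assumption (prepended with a single space), and `format_prompt` and `append_turn` are hypothetical names, not functions from the model's source.

```python
from typing import Optional


def format_prompt(question: str, context: Optional[str] = None) -> str:
    """Build a prompt in the 'Question: ... Answer:' shape seen in the log."""
    prompt = f"Question: {question} Answer:"
    if context:
        # Assumption: earlier turns are simply prepended to the new prompt.
        prompt = f"{context.strip()} {prompt}"
    return prompt


def append_turn(context: Optional[str], question: str, answer: str) -> str:
    """Extend a context string with one completed question/answer turn."""
    turn = f"Question: {question} Answer: {answer}"
    return f"{context.strip()} {turn}" if context else turn
```

A string built with `append_turn` could be passed as the `context` input on a follow-up call, so that a later question can refer back to an earlier answer.
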
Version Details

Version ID: f677695e5e89f8b236e52ecd1d3f01beb44c34606419bcc19345e046d8f786f9
Version Created: November 20, 2023