smartinezbragado/salesforce-blip2 (image + text → text)

967 runs · Jun 2023 · Cog 0.7.2 · GitHub · Paper
image-captioning image-to-text visual-question-answering

About

BLIP-2 model served from the blip2-flan-t5-xl-coco checkpoint (Flan-T5-XL fine-tuned on the COCO captioning dataset)

Example Output

Output

a bunch of coconuts with the shells cut off

Performance Metrics

4.36s Prediction Time
496.31s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/J4Q1WGe34BzyWARHpIrNeTco30ZFlGCateRidPSKM22OukMH/cocos.jpeg",
  "caption": true,
  "question": "What is this a picture of?"
}
Input Parameters
image (required) Type: string
Input image to query or caption
caption Type: boolean Default: false
Select this if you want to generate an image caption instead of answering a question
context Type: string
Optional: previous questions and answers, used as context when answering the current question
question Type: string Default: What is this a picture of?
Question to ask about the image. Leave blank for captioning.
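The parameters above can be assembled into a prediction call with the Replicate Python client. This is a minimal sketch, assuming the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment; `build_input` is a hypothetical helper added here only to show how the payload maps onto the schema.

```python
# Sketch of calling this model via the Replicate Python client.
# MODEL_VERSION uses the version ID listed under "Version Details" below.
MODEL_VERSION = (
    "smartinezbragado/salesforce-blip2:"
    "49a5561c65047cff04000654f0e217b02f4656e324f08299c560b89c512da2be"
)

def build_input(image_url, caption=False,
                question="What is this a picture of?", context=None):
    """Assemble the input dict from the parameters listed above.

    `image` is required; `caption` toggles captioning mode; `context`
    is optional prior Q&A used when answering the current question.
    """
    payload = {"image": image_url, "caption": caption, "question": question}
    if context:
        payload["context"] = context
    return payload

def run(image_url, **kwargs):
    # Imported lazily so payload-building works without the package installed.
    import replicate
    return replicate.run(MODEL_VERSION, input=build_input(image_url, **kwargs))

if __name__ == "__main__":
    # Captioning mode, using the example image from "All Input Parameters".
    print(run(
        "https://replicate.delivery/pbxt/J4Q1WGe34BzyWARHpIrNeTco30ZFlGCateRidPSKM22OukMH/cocos.jpeg",
        caption=True,
    ))
```

Setting `"caption": true` makes the model ignore the question and return a caption string, matching the example output above.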
Output Schema

Output

Type: string

Example Execution Logs
/root/.pyenv/versions/3.8.17/lib/python3.8/site-packages/transformers/generation/utils.py:1353: UserWarning: Using `max_length`'s default (20) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
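The warning above means generation is falling back to the default `max_length` of 20 tokens, which can silently truncate longer captions. A hedged sketch of the fix the warning itself recommends, passing `max_new_tokens` explicitly, is shown below; it assumes the `transformers` package and the `Salesforce/blip2-flan-t5-xl-coco` checkpoint on the Hugging Face Hub (loading it downloads several GB of weights), and `caption_image` is a hypothetical helper, not part of this model's code.

```python
# Passing max_new_tokens explicitly avoids the UserWarning above and
# bounds only the number of newly generated tokens, not the prompt length.
GEN_KWARGS = {"max_new_tokens": 32}

def caption_image(image):
    # Heavy imports kept inside the function: loading the checkpoint
    # downloads several GB of weights on first use.
    from transformers import Blip2Processor, Blip2ForConditionalGeneration
    name = "Salesforce/blip2-flan-t5-xl-coco"
    processor = Blip2Processor.from_pretrained(name)
    model = Blip2ForConditionalGeneration.from_pretrained(name)
    inputs = processor(images=image, return_tensors="pt")
    ids = model.generate(**inputs, **GEN_KWARGS)
    return processor.batch_decode(ids, skip_special_tokens=True)[0].strip()
```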
Version Details
Version ID
49a5561c65047cff04000654f0e217b02f4656e324f08299c560b89c512da2be
Version Created
June 26, 2023