smartinezbragado/salesforce-blip2 (image + text → text)

967 runs · Jun 2023 · Cog 0.7.2 · GitHub · Paper
image-captioning image-to-text visual-question-answering

About

BLIP-2 model served from the blip2-flan-t5-xl-coco checkpoint (Flan-T5-XL fine-tuned on the COCO captioning dataset)

Example Output

Output

a bunch of coconuts with the shells cut off

Performance Metrics

4.36s Prediction Time
496.31s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/J4Q1WGe34BzyWARHpIrNeTco30ZFlGCateRidPSKM22OukMH/cocos.jpeg",
  "caption": true,
  "question": "What is this a picture of?"
}
Input Parameters
image (required) Type: string
Input image to query or caption
caption Type: boolean Default: false
Select this if you want to generate an image caption instead of answering a question
context Type: string
Optional: previous questions and answers, used as context when answering the current question
question Type: string Default: What is this a picture of?
Question to ask about the image. Leave blank for captioning.
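The parameters above can be assembled into a prediction call with the Replicate Python client. This is a minimal sketch, assuming the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment; `build_input` is a hypothetical helper added here only to show how the payload maps onto the schema.

```python
# Sketch of calling this model via the Replicate Python client.
# MODEL_VERSION uses the version ID listed under "Version Details" below.
MODEL_VERSION = (
    "smartinezbragado/salesforce-blip2:"
    "49a5561c65047cff04000654f0e217b02f4656e324f08299c560b89c512da2be"
)

def build_input(image_url, caption=False,
                question="What is this a picture of?", context=None):
    """Assemble the input dict from the parameters listed above.

    `image` is required; `caption` toggles captioning mode; `context`
    is optional prior Q&A used when answering the current question.
    """
    payload = {"image": image_url, "caption": caption, "question": question}
    if context:
        payload["context"] = context
    return payload

def run(image_url, **kwargs):
    # Imported lazily so payload-building works without the package installed.
    import replicate
    return replicate.run(MODEL_VERSION, input=build_input(image_url, **kwargs))

if __name__ == "__main__":
    # Captioning mode, using the example image from "All Input Parameters".
    print(run(
        "https://replicate.delivery/pbxt/J4Q1WGe34BzyWARHpIrNeTco30ZFlGCateRidPSKM22OukMH/cocos.jpeg",
        caption=True,
    ))
```

Setting `"caption": true` makes the model ignore the question and return a caption string, matching the example output above.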
Output Schema

Output

Type: string

Example Execution Logs
/root/.pyenv/versions/3.8.17/lib/python3.8/site-packages/transformers/generation/utils.py:1353: UserWarning: Using `max_length`'s default (20) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
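The warning above means generation is falling back to the default `max_length` of 20 tokens, which can silently truncate longer captions. A hedged sketch of the fix the warning itself recommends, passing `max_new_tokens` explicitly, is shown below; it assumes the `transformers` package and the `Salesforce/blip2-flan-t5-xl-coco` checkpoint on the Hugging Face Hub (loading it downloads several GB of weights), and `caption_image` is a hypothetical helper, not part of this model's code.

```python
# Passing max_new_tokens explicitly avoids the UserWarning above and
# bounds only the number of newly generated tokens, not the prompt length.
GEN_KWARGS = {"max_new_tokens": 32}

def caption_image(image):
    # Heavy imports kept inside the function: loading the checkpoint
    # downloads several GB of weights on first use.
    from transformers import Blip2Processor, Blip2ForConditionalGeneration
    name = "Salesforce/blip2-flan-t5-xl-coco"
    processor = Blip2Processor.from_pretrained(name)
    model = Blip2ForConditionalGeneration.from_pretrained(name)
    inputs = processor(images=image, return_tensors="pt")
    ids = model.generate(**inputs, **GEN_KWARGS)
    return processor.batch_decode(ids, skip_special_tokens=True)[0].strip()
```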
Version Details
Version ID
49a5561c65047cff04000654f0e217b02f4656e324f08299c560b89c512da2be
Version Created
June 26, 2023