nelsonjchen/minigpt-4_vicuna-13b 🖼️📝🔢 → 📝

52.0K runs · Apr 2023 · Cog 0.6.1
Tags: image-captioning, image-to-text, visual-question-answering

About

MiniGPT-4 with Vicuna-13B, for image question answering and captioning.

Example Output

This photo is funny because it shows a group of men in suits standing in front of a mirror, looking at themselves. The man in the middle is wearing a suit and tie, while the other men are wearing suits and ties as well. They all appear to be looking at themselves in the mirror, which adds to the humor of the photo. The fact that they are all dressed up in suits and ties, but standing in front of a bathroom mirror, is ironic and adds to the humor of the photo.

Performance Metrics

Prediction Time: 12.59 s
Total Time: 807.36 s

All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/Iipg4ffvshGdlH9EYNf8akUGjcAKFzvfpK1nnVbCkJbxBhR4/3F7668BD-41F3-43D3-813E-068EFEEAC67B.jpeg",
  "message": "Why is this photo funny?",
  "num_beams": 10,
  "temperature": 1,
  "max_new_tokens": 500
}
Input Parameters

image (required)
Type: string
Input image to discuss.

message
Type: string. Default: "Please describe the image."
Message to send to the bot.

num_beams
Type: integer. Default: 1. Range: 1 - 10.
Number of beams used for beam search.

temperature
Type: number. Default: 1. Range: 0.1 - 2.
Sampling temperature.

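
The defaults and ranges listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the model's own code: the helper name `build_input` is hypothetical, and it only covers the four parameters documented in this section.

```python
from typing import Any, Dict

# Default taken from the parameter listing above.
DEFAULT_MESSAGE = "Please describe the image."


def build_input(
    image: str,
    message: str = DEFAULT_MESSAGE,
    num_beams: int = 1,
    temperature: float = 1.0,
) -> Dict[str, Any]:
    """Build an input payload, enforcing the documented ranges.

    Hypothetical helper for illustration; the model itself may validate
    its inputs differently.
    """
    if not image:
        raise ValueError("image is required")
    if not 1 <= num_beams <= 10:
        raise ValueError("num_beams must be between 1 and 10")
    if not 0.1 <= temperature <= 2:
        raise ValueError("temperature must be between 0.1 and 2")
    return {
        "image": image,
        "message": message,
        "num_beams": num_beams,
        "temperature": temperature,
    }
```

Calling `build_input("https://example.com/photo.jpg")` returns a payload with every optional field set to its documented default, while an out-of-range value such as `num_beams=11` raises `ValueError`.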
Output Schema

Output

Type: string

Version Details
Version ID
47a8e0a09e1e99bb784f3aa6277b5abb0917233ba74e6937eb447fd0bba7fd5a
Version Created
April 24, 2023
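
To get reproducible behavior, a caller can pin the version ID above when invoking the model. The sketch below assumes the `replicate` Python client is installed and that a `REPLICATE_API_TOKEN` environment variable is set. The function name `describe_image` is hypothetical, and the exact return type of `replicate.run` may vary with the client version.

```python
# Model name and version ID as listed on this page.
MODEL = "nelsonjchen/minigpt-4_vicuna-13b"
VERSION = "47a8e0a09e1e99bb784f3aa6277b5abb0917233ba74e6937eb447fd0bba7fd5a"
MODEL_REF = f"{MODEL}:{VERSION}"


def describe_image(image_url: str, message: str = "Please describe the image."):
    """Run the pinned model version on one image and return its output.

    Hypothetical wrapper for illustration. The import is deferred so the
    module can be loaded without the client library installed.
    """
    import replicate  # third-party client; reads REPLICATE_API_TOKEN

    return replicate.run(
        MODEL_REF,
        input={
            "image": image_url,
            "message": message,
            "num_beams": 1,       # documented default
            "temperature": 1.0,   # documented default
        },
    )
```

With the example input shown earlier, this would be called as `describe_image(image_url, "Why is this photo funny?")`. According to the output schema, the result is expected to be a string.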