nelsonjchen/minigpt-4_vicuna-7b 🖼️📝🔢 → 📝
About
MiniGPT-4 with Vicuna-7B (for image question answering and captioning)

Example Output
Output
The image shows a group of people standing at a train station platform, waiting for a train to arrive. The platform is made of concrete and has a metal railing on one side. There are several people in the image, all wearing different types of clothing and carrying various items such as bags and umbrellas. Some of the people are looking at their phones or talking to each other, while others are just standing around waiting. The sky is overcast and there are a few clouds visible in the distance.
Performance Metrics
- Prediction Time: 9.72s
- Total Time: 9.88s
All Input Parameters
{ "image": "https://replicate.delivery/pbxt/Ii9Eo0VGYLq2KIfgz3zKMr2QEQRl9n4a45E910Ctu4btAoxY/pexels-alexandra-folster-6307862.jpg", "message": "Please describe the image.", "num_beams": 10, "temperature": 1, "max_new_tokens": 500 }
Input Parameters
- image (required): Input image to discuss.
- message: Message to send to MiniGPT-4.
- num_beams: Number of beams used in beam search. More beams require more VRAM.
- temperature: Adjusts the randomness of outputs. 0 is deterministic and values greater than 1 are increasingly random; 0.75 is a good starting value.
- max_new_tokens: Maximum number of tokens to generate (minimum: 1). A word is generally 2-3 tokens.
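The parameters above can be assembled into a request payload before sending it to the model. The following is a minimal sketch, not the model's own code: `build_input` is a hypothetical helper name, and only the `max_new_tokens` minimum of 1 comes from the parameter list. The checks on `num_beams` and `temperature` are assumptions added for illustration, and no default values are filled in because none are stated here.

```python
from typing import Any, Dict, Optional


def build_input(
    image: str,
    message: Optional[str] = None,
    num_beams: Optional[int] = None,
    temperature: Optional[float] = None,
    max_new_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an input payload; `image` is the only required field."""
    if not image:
        raise ValueError("image is required")
    # Documented constraint: max_new_tokens has a minimum of 1.
    if max_new_tokens is not None and max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    # Assumed constraints (not stated in the parameter list).
    if num_beams is not None and num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    if temperature is not None and temperature < 0:
        raise ValueError("temperature must not be negative")

    payload: Dict[str, Any] = {"image": image}
    optional = {
        "message": message,
        "num_beams": num_beams,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
    }
    # Omit unset fields so the model's own defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Calling `build_input(image_url, message="Please describe the image.", num_beams=10, temperature=1, max_new_tokens=500)` reproduces the shape of the example input shown under All Input Parameters.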
Output Schema
Output
Version Details
- Version ID: 4786ec1df0917116b1f58996d93a25b06e8d4dc1641da94a7ac6b97e26cd6394
- Version Created: April 25, 2023
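Pinning the version ID keeps results reproducible if newer versions of the model are published. The sketch below assumes the third-party `replicate` Python client is installed and that a `REPLICATE_API_TOKEN` environment variable is set. `model_ref` and `describe_image` are hypothetical helper names, and the default message is taken from the example input on this page.

```python
MODEL = "nelsonjchen/minigpt-4_vicuna-7b"
VERSION = "4786ec1df0917116b1f58996d93a25b06e8d4dc1641da94a7ac6b97e26cd6394"


def model_ref(model: str = MODEL, version: str = VERSION) -> str:
    """Build the 'owner/name:version' reference used to pin a version."""
    return f"{model}:{version}"


def describe_image(image_url: str, message: str = "Please describe the image."):
    """Run the pinned model version on one image (makes a network call)."""
    # Imported here so the helpers above work without the client installed.
    import replicate

    return replicate.run(
        model_ref(),
        input={"image": image_url, "message": message},
    )
```

`describe_image` is not invoked here because it requires network access and API credentials. `model_ref()` can be checked offline.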