daanelson/minigpt-4 🖼️🔢📝 → 📝

▶️ 1.8M runs 📅 May 2023 ⚙️ Cog v0.7.0-beta17+dev

image-captioning image-to-text visual-question-answering visual-understanding

About

A model which generates text in response to an input image and prompt.

Example Output

Prompt:

"This llama's name is Dave. Write me a story about how Dave found his skateboard."

Output:

Dave the llama was feeling very bored one day. He had been wandering around the city for hours, but there was nothing interesting to do. Suddenly, he saw a skateboard lying on the ground. He decided to try it out, and as soon as he started riding it, he felt a rush of excitement. He rode around the city, enjoying the feeling of the wind in his hair and the freedom of being on his own. As he rode, he saw all sorts of interesting things that he had never noticed before. He even met some new friends along the way. After a while, Dave realized that he had found his true passion - skateboarding. From then on, he spent all his free time riding his skateboard and exploring the city.

Performance Metrics

16.97s Prediction Time

16.94s Total Time

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
  "top_p": 0.9,
  "prompt": "This llama's name is Dave. Write me a story about how Dave found his skateboard.",
  "num_beams": 5,
  "max_length": 4000,
  "temperature": 1.32,
  "max_new_tokens": 3000,
  "repetition_penalty": 1
}
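
The parameters above can be sent to the model as the `input` of a prediction request. The following is a minimal sketch using only the Python standard library. It builds the request without sending it. The endpoint URL, the `Bearer` authorization scheme, and the `{"version": ..., "input": ...}` body shape are assumptions about Replicate's HTTP API and should be checked against the official reference. The helper name `build_prediction_request` is hypothetical. The version ID is the one listed under Version Details.

```python
import json
import urllib.request

# Assumed endpoint for creating predictions; verify against Replicate's API docs.
API_URL = "https://api.replicate.com/v1/predictions"

# Version ID as listed under "Version Details" on this page.
VERSION_ID = "e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06"

# The example input shown above, copied verbatim.
EXAMPLE_INPUT = {
    "image": "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
    "top_p": 0.9,
    "prompt": "This llama's name is Dave. Write me a story about how Dave found his skateboard.",
    "num_beams": 5,
    "max_length": 4000,
    "temperature": 1.32,
    "max_new_tokens": 3000,
    "repetition_penalty": 1,
}


def build_prediction_request(api_token, model_input):
    """Build (but do not send) a POST request for a single prediction."""
    body = json.dumps({"version": VERSION_ID, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_prediction_request("YOUR_API_TOKEN", EXAMPLE_INPUT)
# Sending is left out on purpose; with a real token it would be:
# urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.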

Input Parameters

image (required) Type: string: Image to discuss
top_p Type: number, Default: 0.9, Range: 0 - 1: Sample from the top p percent most likely tokens
prompt (required) Type: string: Prompt for mini-gpt4 regarding input image
num_beams Type: integer, Default: 3, Range: 1 - 10: Number of beams for beam search decoding
max_length Type: integer, Default: 4000, Range: 1 - ∞: Total length of prompt and output in tokens
temperature Type: number, Default: 1, Range: 0.01 - 2: Temperature for generating tokens, lower = more predictable results
max_new_tokens Type: integer, Default: 3000, Range: 1 - ∞: Maximum number of new tokens to generate
repetition_penalty Type: number, Default: 1, Range: 0.01 - 5: Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.
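
The defaults and ranges listed above can be checked on the client before a request is made. The sketch below is one possible way to do that, not the model's own validation logic. The names `PARAM_SPEC` and `prepare_input` are hypothetical, and treating both range bounds as inclusive is an assumption.

```python
import math

# name: (kind, default, minimum, maximum), transcribed from the parameter list.
# A default of None marks a required parameter. Bounds are assumed inclusive.
PARAM_SPEC = {
    "image": ("string", None, None, None),
    "prompt": ("string", None, None, None),
    "top_p": ("number", 0.9, 0, 1),
    "num_beams": ("integer", 3, 1, 10),
    "max_length": ("integer", 4000, 1, math.inf),
    "temperature": ("number", 1, 0.01, 2),
    "max_new_tokens": ("integer", 3000, 1, math.inf),
    "repetition_penalty": ("number", 1, 0.01, 5),
}


def _has_kind(value, kind):
    """Return True if value matches the schema kind (bool is never accepted)."""
    if isinstance(value, bool):
        return False
    if kind == "string":
        return isinstance(value, str)
    if kind == "integer":
        return isinstance(value, int)
    return isinstance(value, (int, float))  # "number"


def prepare_input(user_input):
    """Fill in defaults and reject missing, unknown, or out-of-range values."""
    unknown = sorted(set(user_input) - set(PARAM_SPEC))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")

    prepared = {}
    for name, (kind, default, low, high) in PARAM_SPEC.items():
        if name not in user_input:
            if default is None:
                raise ValueError(f"missing required parameter: {name}")
            prepared[name] = default
            continue
        value = user_input[name]
        if not _has_kind(value, kind):
            raise TypeError(f"{name} must be of type {kind}")
        if low is not None and not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside the range {low} - {high}")
        prepared[name] = value
    return prepared


# Only the two required parameters are given; the rest fall back to defaults.
full_input = prepare_input({
    "image": "https://example.com/llama.png",  # placeholder URL
    "prompt": "Describe this image.",
})
```

Failing early on a value such as `temperature=2.5` avoids spending a prediction on input that falls outside the documented range.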

Output Schema

Output

Type: string

Version Details

Version ID: e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06
Version Created: May 14, 2024
