yorickvp/llava-v1.6-mistral-7b 🖼️🔢📝 → 📝

▶️ 4.9M runs 📅 Feb 2024 ⚙️ Cog 0.9.2 🔗 GitHub ⚖️ License
image-analysis image-captioning image-to-text text-generation visual-understanding

About

LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)

Example Output

Prompt:

"What is unusual about this image?"

Output

The unusual aspect of this image is that a man is standing on the back of a yellow SUV, ironing clothes. This is not a typical scene, as one would expect to see the man either inside the vehicle or on the ground, rather than standing on the back of the SUV. The act of ironing clothes while standing on the back of a moving vehicle is both unusual and potentially dangerous.

Performance Metrics

4.45s Prediction Time
144.84s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/KKNB7w6pjN79j5pHDSyYXa5EwaQE9FL5fx6Qa83XMn1HYuKm/extreme_ironing.jpg",
  "top_p": 1,
  "prompt": "What is unusual about this image?",
  "max_tokens": 1024,
  "temperature": 0.2
}
Input Parameters
image Type: string
Input image
top_p Type: number, Default: 1, Range: 0 - 1
When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens
prompt (required) Type: string
Prompt to use for text generation
history Type: array
List of earlier chat messages, alternating roles, starting with user input. Include <image> to specify which message to attach the image to.
max_tokens Type: integer, Default: 1024, Range: 0 - ∞
Maximum number of tokens to generate. A word is generally 2-3 tokens
temperature Type: number, Default: 0.2, Range: 0 - ∞
Adjusts the randomness of outputs: values greater than 1 are more random, and 0 is deterministic
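
The parameters above can be assembled into a request roughly as follows. This is a minimal sketch, not the model author's code: `build_input` and `run_prediction` are hypothetical helper names, the range checks simply mirror the ranges listed above, and the call assumes the third-party `replicate` Python client is installed with a `REPLICATE_API_TOKEN` set in the environment.

```python
# Model reference taken from the page: owner/name plus the version ID.
MODEL_VERSION = (
    "yorickvp/llava-v1.6-mistral-7b:"
    "19be067b589d0c46689ffa7cc3ff321447a441986a7694c01225973c2eafc874"
)


def build_input(prompt, image=None, history=None,
                top_p=1.0, max_tokens=1024, temperature=0.2):
    """Hypothetical helper: build the input dict and check the documented ranges."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 0:
        raise ValueError("max_tokens must be a non-negative integer")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")

    payload = {
        "prompt": prompt,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # Optional fields are only sent when provided.
    if image is not None:
        payload["image"] = image
    if history:
        payload["history"] = list(history)
    return payload


def run_prediction(payload):
    """Send the payload to Replicate (assumes the `replicate` package and an API token)."""
    import replicate  # third-party client, imported lazily so build_input works without it

    output = replicate.run(MODEL_VERSION, input=payload)
    # The output schema is an array of strings; joining with "" is an assumption.
    return "".join(output)
```

Calling `run_prediction(build_input("What is unusual about this image?", image=<image URL>))` with the example image URL would correspond to the request shown under All Input Parameters.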
Output Schema

Output

Type: array, Items Type: string
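
Because the output is an array of strings rather than a single string, a caller will usually want to combine the items. The sketch below shows one way to do that; `collect_output` is a hypothetical helper, and it assumes each item already carries its own whitespace, so the items are joined without a separator.

```python
from typing import Iterable


def collect_output(chunks: Iterable[str]) -> str:
    """Hypothetical helper: merge the array of output strings into one response.

    Assumes each chunk includes its own spacing, so no separator is inserted.
    Works for lists as well as lazily produced iterators.
    """
    return "".join(chunks).strip()


# Illustrative chunks, not real model output.
example_chunks = ["The unusual ", "aspect of ", "this image ", "is ..."]
text = collect_output(example_chunks)
```

If the items turn out to be whole sentences or words without trailing spaces, joining with `" "` instead would be the natural adjustment.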

Example Execution Logs
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Version Details
Version ID
19be067b589d0c46689ffa7cc3ff321447a441986a7694c01225973c2eafc874
Version Created
February 2, 2024