yorickvp/llava-v1.6-mistral-7b 🖼️🔢📝 → 📝

▶️ 5.0M runs 📅 Feb 2024 ⚙️ Cog 0.9.2

image-analysis image-captioning image-to-text text-generation visual-understanding

About

LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)

Example Output

Prompt:

"What is unusual about this image?"

Output

The unusual aspect of this image is that a man is standing on the back of a yellow SUV, ironing clothes. This is not a typical scene, as one would expect to see the man either inside the vehicle or on the ground, rather than standing on the back of the SUV. The act of ironing clothes while standing on the back of a moving vehicle is both unusual and potentially dangerous.

Performance Metrics

4.45s Prediction Time

144.84s Total Time

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/KKNB7w6pjN79j5pHDSyYXa5EwaQE9FL5fx6Qa83XMn1HYuKm/extreme_ironing.jpg",
  "top_p": 1,
  "prompt": "What is unusual about this image?",
  "max_tokens": 1024,
  "temperature": 0.2
}
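The parameters above map onto a call through Replicate's Python client. What follows is a minimal sketch, assuming the `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_prediction` are illustrative and not part of any API; the version hash is the one listed under Version Details.

```python
MODEL = "yorickvp/llava-v1.6-mistral-7b"
VERSION = "19be067b589d0c46689ffa7cc3ff321447a441986a7694c01225973c2eafc874"


def build_input(image_url, prompt, top_p=1, max_tokens=1024, temperature=0.2):
    """Assemble the input payload, using the defaults documented on this page."""
    return {
        "image": image_url,
        "top_p": top_p,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def run_prediction(payload):
    """Send the payload to Replicate and return the generated text."""
    # Imported lazily so the helpers above work without the package installed.
    import replicate

    output = replicate.run(f"{MODEL}:{VERSION}", input=payload)
    # The output schema is an array of strings. Joining the items without a
    # separator is an assumption about how the chunks are meant to be combined.
    return "".join(output)
```

Calling `run_prediction(build_input(image_url, "What is unusual about this image?"))` with the image URL shown above would submit the same request as the example. The returned text may differ from the example output, since sampling at a temperature of 0.2 is not fully deterministic.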

Input Parameters

image (Type: string): Input image
top_p (Type: number, Default: 1, Range: 0 - 1): When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens
prompt (required, Type: string): Prompt to use for text generation
history (Type: array): List of earlier chat messages, alternating roles, starting with user input. Include <image> to specify which message to attach the image to.
max_tokens (Type: integer, Default: 1024, Range: 0 - ∞): Maximum number of tokens to generate. A word is generally 2-3 tokens
temperature (Type: number, Default: 0.2, Range: 0 - ∞): Adjusts randomness of outputs, greater than 1 is random and 0 is deterministic
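The ranges listed for the numeric parameters can be checked on the client side before a request is sent. The sketch below assumes nothing beyond the ranges and defaults stated above; `validate_params` is a hypothetical helper, not something the model or Replicate provides.

```python
def validate_params(top_p=1, max_tokens=1024, temperature=0.2):
    """Raise ValueError if a value falls outside the documented range."""
    # top_p: number in the closed interval 0 - 1
    if not 0 <= top_p <= 1:
        raise ValueError(f"top_p must be between 0 and 1, got {top_p}")
    # max_tokens: non-negative integer (bool is rejected even though it subclasses int)
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 0:
        raise ValueError(f"max_tokens must be an integer >= 0, got {max_tokens}")
    # temperature: non-negative number with no documented upper bound
    if temperature < 0:
        raise ValueError(f"temperature must be >= 0, got {temperature}")
    return {"top_p": top_p, "max_tokens": max_tokens, "temperature": temperature}
```

Failing early in this way avoids spending a prediction on a request the service would likely reject.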

Output Schema

Output

Type: array • Items Type: string
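Because the output is an array of strings rather than a single string, a caller usually needs to reassemble it. The sketch below assumes the items are consecutive text chunks that should be concatenated without a separator; that is an assumption, as the schema itself does not say how the items relate. `collect_output` and its `on_chunk` callback are hypothetical names.

```python
def collect_output(chunks, on_chunk=None):
    """Concatenate string chunks, optionally reporting each one as it arrives."""
    parts = []
    for chunk in chunks:
        if on_chunk is not None:
            on_chunk(chunk)  # e.g. print partial text while it is generated
        parts.append(chunk)
    return "".join(parts)
```

The function accepts any iterable of strings, so it works both with a fully materialized list and with an iterator that yields chunks one at a time.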

Example Execution Logs

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.

Version Details

Version ID: 19be067b589d0c46689ffa7cc3ff321447a441986a7694c01225973c2eafc874
Version Created: February 2, 2024
