ibm-granite/granite-vision-3.2-2b 🔢📝 → 📝

⭐ Official ▶️ 118.0K runs 📅 Feb 2025 ⚙️ Cog 0.10.0-alpha27
image-to-text ocr visual-question-answering

About

Granite-Vision-3.2-2B is a compact and efficient vision-language model, specifically designed for visual document understanding.

Example Output

Prompt:

"Describe this image"

Output:

The image depicts a detailed view of the Moon, showcasing its surface with remarkable clarity and detail. The Moon appears to be captured in a black-and-white photograph, which enhances the contrast between the lunar surface and the dark background. The lunar surface is dotted with numerous craters, which are small, circular depressions formed by meteorite impacts. These craters vary in size, with some being relatively small and others significantly larger.

The central region of the Moon exhibits a distinct bright spot, known as the "dark side" or "far side," which is not illuminated by the Sun. This dark area is due to the Moon's lack of a global magnetic field, which means it does not have a strong magnetic field to deflect solar wind and cosmic radiation. The bright side of the Moon, on the other hand, is brightly illuminated by the Sun, creating a stark contrast between the lit and shadowed areas.

The Moon's surface is rugged and uneven, with numerous ridges and valleys. These features are formed by ancient volcanic activity and tectonic movements that have shaped the Moon over billions of years. The Moon's surface is composed primarily of rock and dust, with some areas displaying a more porous texture, which reflects light differently.

In the central region, there is a prominent central peak or mountain, which is likely a result of ancient impacts and volcanic activity. The lunar craters vary in size and distribution, with some appearing closer to the center of the Moon and others scattered more towards the edges. The distribution of craters is not uniform, indicating that the Moon has been geologically active over a long period.

The photograph captures the Moon's surface in a way that highlights its geological diversity and the impacts that have shaped its history. The lack of atmosphere and magnetic field on the Moon means that it does not experience weathering or erosion, which allows for the preservation of its ancient features.

In summary, the image provides a detailed view of the Moon's surface, showcasing its rugged and varied terrain, numerous craters, and the contrast between the illuminated and dark sides. The lack of atmosphere

Performance Metrics

3.07s Prediction Time
3.07s Total Time
All Input Parameters
{
  "image": "https://upload.wikimedia.org/wikipedia/commons/e/e1/FullMoon2010.jpg",
  "top_p": 0.9,
  "prompt": "Describe this image",
  "max_tokens": 512,
  "temperature": 0.6,
  "system_prompt": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."
}
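As a rough sketch, the same input could be assembled and sent from Python. The helper names `build_input` and `run_prediction` are hypothetical, and the range checks are illustrative sanity checks rather than documented constraints. The `run_prediction` function assumes the `replicate` Python client is installed and that a `REPLICATE_API_TOKEN` environment variable is set; it is not called here, so the block runs without network access.

```python
import json

# Model name as shown at the top of this page.
MODEL = "ibm-granite/granite-vision-3.2-2b"


def build_input(image_url, prompt, max_tokens=512, temperature=0.6,
                top_p=0.9, system_prompt=None):
    """Assemble an input payload in the same shape as the JSON above.

    The range checks are illustrative assumptions, not limits taken
    from the model's documentation.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")

    payload = {
        "image": image_url,
        "top_p": top_p,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # Only send a system prompt when the caller supplies one, so the
    # model's own default applies otherwise.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload


def run_prediction(payload):
    """Send the payload to the hosted model (assumes the `replicate` client)."""
    import replicate  # imported lazily so the sketch runs without the package
    return replicate.run(MODEL, input=payload)


if __name__ == "__main__":
    example = build_input(
        "https://upload.wikimedia.org/wikipedia/commons/e/e1/FullMoon2010.jpg",
        "Describe this image",
    )
    print(json.dumps(example, indent=2))
```

Keeping payload construction separate from the network call makes it easy to inspect or log the exact input before a prediction is started.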
Input Parameters
top_k Type: integer Default: 50
The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p Type: number Default: 0.9
A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt Type: string Default: (empty)
The text prompt to send to the model.
max_tokens Type: integer Default: 512
The maximum number of tokens the model should generate as output.
min_tokens Type: integer Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number Default: 0.6
The value used to modulate the next token probabilities.
system_prompt Type: string Default: You are a helpful assistant.
System prompt to send to the model. This is prepended to the prompt and helps guide system behavior. Ignored for non-chat models.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
presence_penalty Type: number Default: 0
Presence penalty
frequency_penalty Type: number Default: 0
Frequency penalty
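To make the temperature, top_k, and top_p descriptions above concrete, here is a minimal pure-Python sketch of temperature scaling followed by top-k and nucleus filtering. It illustrates the general technique only; the function names are hypothetical, and the serving stack behind this model may order or implement these steps differently.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the result."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=50, top_p=0.9):
    """Apply top-k, then nucleus (top-p) filtering.

    Returns a dict mapping each surviving token index to its
    renormalized probability.
    """
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: if top_k > 0, keep only the k most probable tokens.
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: if top_p < 1.0, keep the smallest prefix whose cumulative
    # probability reaches top_p.
    kept = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if top_p < 1.0 and cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


if __name__ == "__main__":
    probs = softmax([2.0, 1.0, 0.5, -1.0], temperature=0.6)
    print(filter_top_k_top_p(probs, top_k=3, top_p=0.9))
```

A sampler would then draw the next token from the returned distribution. Setting top_k to 0 and top_p to 1.0 disables both filters in this sketch.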
Output Schema

Output

Type: array, Items Type: string
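Because the output is an array of strings, a caller typically needs to combine the items into one piece of text. The sketch below assumes the items are consecutive chunks of the generated text that can be concatenated without a separator; that is an assumption, not something the schema states. The helper name `collect_output` is hypothetical.

```python
def collect_output(chunks):
    """Join an iterable of string chunks into a single string.

    Assumes each item is a consecutive piece of the generated text,
    so no separator is inserted between items.
    """
    return "".join(chunks)


if __name__ == "__main__":
    # Works with a list or with any iterator that yields strings.
    print(collect_output(["The image depicts ", "a detailed view ", "of the Moon"]))
```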

Example Execution Logs
INFO:     127.0.0.1:36560 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 02-26 14:37:51 metrics.py:455] Avg prompt throughput: 1011.3 tokens/s, Avg generation throughput: 0.1 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.5%, CPU KV cache usage: 0.0%.
Generation took 2.94s
Formatted prompt: Describe this image
Version Details
Version ID
2dd748835ec71ee87ec3f6d7d63aff63b276f0cf903c5afcaf57f9f3ca99d7a8
Version Created
March 5, 2025