ibm-granite/granite-vision-3.3-2b 🔢🖼️📝 → 📝

⭐ Official ▶️ 38.9K runs 📅 Jul 2025 ⚙️ Cog 0.16.1
image-analysis image-captioning image-to-text ocr visual-question-answering

About

Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

Example Output

Prompt:

"Describe the image"

Output

Image Description

The image consists of a detailed diagram of a vehicle's rearview mirror, along with an adjacent table listing the pin descriptions for the rearview mirror.

Rearview Mirror Diagram

  1. Connector A: This is the main connection point on the rearview mirror, where it interfaces with the vehicle's electrical system.
  2. LED Indicator: Located above the mirror, this LED light illuminates when the rearview mirror is in use.
  3. Reward Looking Sensor: This sensor is used to assist the rearview mirror in providing a clear view by detecting the presence of objects behind the vehicle.
  4. Forward Looking Sensor: This sensor is used to assist in the vehicle's forward vision, ensuring that the rearview mirror can provide a clear view of the area behind the vehicle.
  5. ON/OFF Switch: Located on the rearview mirror, this switch turns the mirror on and off.

Pin Description Table

The table provides a detailed pinout for the rearview mirror, with corresponding descriptions:

| Pin | Description    |
|-----|----------------|
| 1   | Key power      |
| 2   | OEC(+)         |
| 3   | Key out        |
| 4   | -              |
| 5   | -              |
| 6   | Battery        |
| 7   | OEC(-)         |
| 8   | GND            |
| 9   | Reverse signal |
| 10  | IGN (+)        |

Integrated Knowledge and Analysis

Rearview Mirror Functionality

A rearview mirror is a crucial component of a vehicle, primarily used by the driver to view the area behind the vehicle. It is essential for safe driving, as it helps the driver to monitor traffic conditions and make informed decisions regarding lane changes, turns, and stops. The rearview mirror can be manually operated or automatically activated via a turn signal, which is controlled by the vehicle's electrical system.

Pin Description Explanation

  1. Pin 1 (Key power): This pin is connected to the vehicle's power supply and is used to activate the rearview mirror.
  2. Pin 2 (OEC(+)): This pin is connected to the OEC (On

Performance Metrics

Prediction Time: 6.33s
Total Time: 267.68s

All Input Parameters
{
  "top_k": 50,
  "top_p": 0.9,
  "images": [
    "https://replicate.delivery/pbxt/NMB7MaGMogCUtggblf8x4k15iviuAVwXYg1TcbzgD3jAruW5/hyundai%20rear%20view%20mirror%20power%20schematic.jpg"
  ],
  "prompt": "Describe the image",
  "max_tokens": 512,
  "min_tokens": 0,
  "temperature": 0.6,
  "presence_penalty": 0,
  "frequency_penalty": 0
}
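A request like the one above could be assembled and sent from Python roughly as follows. This is a minimal sketch, not an official client: `build_input` and `describe` are hypothetical helper names, the `min_tokens`/`max_tokens` check is a local sanity check rather than documented server behavior, and the call assumes the third-party `replicate` package is installed with `REPLICATE_API_TOKEN` set in the environment.

```python
MODEL = "ibm-granite/granite-vision-3.3-2b"


def build_input(prompt, image_urls, **overrides):
    """Return an input dict mirroring the example request shown above."""
    payload = {
        "top_k": 50,
        "top_p": 0.9,
        "images": list(image_urls),
        "prompt": prompt,
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "presence_penalty": 0,
        "frequency_penalty": 0,
    }
    payload.update(overrides)
    # Local sanity check (an assumption of this sketch, not a documented rule).
    if payload["min_tokens"] > payload["max_tokens"]:
        raise ValueError("min_tokens must not exceed max_tokens")
    return payload


def describe(prompt, image_urls, **overrides):
    """Run the model and return its output as one string."""
    # Imported lazily so build_input works without the client installed.
    import replicate  # third-party: pip install replicate

    output = replicate.run(MODEL, input=build_input(prompt, image_urls, **overrides))
    # The output schema is an array of strings; joining them is assumed to
    # reproduce the full generated text.
    return "".join(output)
```

Keyword overrides replace the defaults one by one, for example `build_input("Describe the image", [url], max_tokens=1024)` to give the model more room before generation stops on length.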
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed.
image Type: string
Deprecated single-image input. Use the images input instead. Ignored if images is provided.
top_k Type: integer, Default: 50
The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
top_p Type: number, Default: 0.9
A probability threshold for generating the output. If < 1.0, only the smallest set of most probable tokens whose cumulative probability reaches top_p is kept (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
images Type: array
Image inputs for the model.
prompt Type: string, Default: (empty string)
User prompt to send to the model.
max_tokens Type: integer, Default: 512
The maximum number of tokens the model should generate as output.
min_tokens Type: integer, Default: 0
The minimum number of tokens the model should generate as output.
temperature Type: number, Default: 0.6
Scales the next-token probabilities before sampling. Lower values make the output more deterministic; higher values make it more varied.
chat_template Type: string
A template to format the prompt with. If not provided, the default prompt template will be used.
system_prompt Type: string
System prompt to send to the model. The chat template provides a good default.
stop_sequences Type: string
A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
presence_penalty Type: number, Default: 0
Penalizes tokens that have already appeared in the generated text, regardless of how often. Positive values encourage new tokens.
frequency_penalty Type: number, Default: 0
Penalizes tokens in proportion to how often they have already appeared in the generated text. Positive values discourage repetition.
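To make the interaction of temperature, top_k, and top_p more concrete, here is a minimal pure-Python sketch of how the three could be applied to one step of next-token sampling. It is an illustration under stated assumptions, not the serving engine's implementation: the function names and the toy `logits` dict are hypothetical, and the order used here (temperature, then top-k, then top-p) is an assumption of the sketch.

```python
import math
import random


def candidate_distribution(logits, temperature=0.6, top_k=50, top_p=0.9):
    """Return [(token, probability), ...] left after temperature, top-k, top-p."""
    if temperature <= 0:
        raise ValueError("this sketch only handles temperature > 0")
    # Temperature: divide logits before the softmax; values < 1 sharpen it.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    total = sum(w for _, w in ranked)
    ranked = [(tok, w / total) for tok, w in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        total = sum(p for _, p in kept)
        ranked = [(tok, p / total) for tok, p in kept]
    return ranked


def sample_token(logits, rng=None, **params):
    """Draw one token from the filtered, renormalized distribution."""
    rng = rng or random.Random()
    dist = candidate_distribution(logits, **params)
    tokens = [tok for tok, _ in dist]
    probs = [p for _, p in dist]
    return rng.choices(tokens, weights=probs, k=1)[0]
```

With a toy input such as `{"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}`, lowering `temperature` concentrates probability on `"a"`, while `top_k` and `top_p` shrink the set of tokens that can be drawn at all.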
Output Schema

Output

Type: array, Items Type: string
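Because the output is an array of strings rather than a single string, a caller usually needs to reassemble it. The sketch below assumes the items are consecutive chunks of the generated text that concatenate without a separator; `join_output` is a hypothetical helper name, not part of any client library.

```python
def join_output(chunks):
    """Concatenate an iterable of text chunks into the full response."""
    parts = []
    for chunk in chunks:
        if not isinstance(chunk, str):
            raise TypeError(f"expected str chunk, got {type(chunk).__name__}")
        parts.append(chunk)
    return "".join(parts)
```

The same helper accepts a list or a lazily produced iterator, so it can be used whether the chunks arrive all at once or one at a time.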

Example Execution Logs
2025-07-14 14:05:28 [info     ] predict() commencing           request_id=1 user_prompt=Describe the image
2025-07-14 14:05:28 [debug    ] Formatted prompt using chat template formatted_prompt=<|system|>
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
<|user|>
<image>
Describe the image
<|assistant|>
 request_id=1 user_prompt=Describe the image
2025-07-14 14:05:28 [debug    ] SamplingParams                 request_id=1 sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, stop=[], stop_token_ids=[0], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None) user_prompt=Describe the image
INFO 07-14 14:05:29 [async_llm.py:270] Added request 1.
2025-07-14 14:05:34 [debug    ] result                         finish_reason=length request_id=1 text=### Image Description

The image consists of a detailed diagram of a vehicle's rearview mirror, along with an adjacent table listing the pin descriptions for the rearview mirror.

#### Rearview Mirror Diagram

1. **Connector A**: This is the main connection point on the rearview mirror, where it interfaces with the vehicle's electrical system.
2. **LED Indicator**: Located above the mirror, this LED light illuminates when the rearview mirror is in use.
3. **Reward Looking Sensor**: This sensor is used to assist the rearview mirror in providing a clear view by detecting the presence of objects behind the vehicle.
4. **Forward Looking Sensor**: This sensor is used to assist in the vehicle's forward vision, ensuring that the rearview mirror can provide a clear view of the area behind the vehicle.
5. **ON/OFF Switch**: Located on the rearview mirror, this switch turns the mirror on and off.

#### Pin Description Table

The table provides a detailed pinout for the rearview mirror, with corresponding descriptions:

| Pin | Description         |
|-----|---------------------|
| 1   | Key power           |
| 2   | OEC(+)              |
| 3   | Key out             |
| 4   | -                   |
| 5   | -                   |
| 6   | Battery             |
| 7   | OEC(-)              |
| 8   | GND                 |
| 9   | Reverse signal     |
| 10  | IGN (+)             |

### Integrated Knowledge and Analysis

#### Rearview Mirror Functionality

A rearview mirror is a crucial component of a vehicle, primarily used by the driver to view the area behind the vehicle. It is essential for safe driving, as it helps the driver to monitor traffic conditions and make informed decisions regarding lane changes, turns, and stops. The rearview mirror can be manually operated or automatically activated via a turn signal, which is controlled by the vehicle's electrical system.

#### Pin Description Explanation

1. **Pin 1 (Key power)**: This pin is connected to the vehicle's power supply and is used to activate the rearview mirror.
2. **Pin 2 (OEC(+))**: This pin is connected to the OEC (On user_prompt=Describe the image
2025-07-14 14:05:34 [info     ] Generation took 6.17s          request_id=1 user_prompt=Describe the image
/root/.pyenv/versions/3.12.11/lib/python3.12/site-packages/cog/server/scope.py:22: ExperimentalFeatureWarning: current_scope is an experimental internal function. It may change or be removed without warning.
  warnings.warn(
2025-07-14 14:05:34 [info     ] predict() complete             request_id=1 user_prompt=Describe the image
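The "Formatted prompt using chat template" entry in the logs above shows the shape of the prompt the model actually receives. The following sketch reproduces only that logged shape; it is not the model's real template, `format_prompt` is a hypothetical name, exact whitespace is inferred from the log, and the one-`<image>`-tag-per-image behavior for multiple images is an assumption.

```python
DEFAULT_SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def format_prompt(user_prompt, system_prompt=DEFAULT_SYSTEM_PROMPT, num_images=1):
    """Rebuild the prompt layout seen in the example execution logs."""
    # Assumption: one <image> placeholder line per input image.
    image_tags = "<image>\n" * num_images
    return (
        f"<|system|>\n{system_prompt}\n"
        f"<|user|>\n{image_tags}{user_prompt}\n"
        f"<|assistant|>\n"
    )
```

Printing `format_prompt("Describe the image")` gives a quick way to compare a custom `system_prompt` against the default layout before sending a request.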
Version Details
Version ID
3339e8453ca94104383f6f085a511d7f26cca2d0cab2f6018986737b6cf7d391
Version Created
July 29, 2025