lucataco/smolvlm-instruct 🖼️📝🔢 → 📝

▶️ 3.3K runs 📅 Nov 2024 ⚙️ Cog 0.13.3
image-analysis image-captioning image-to-text ocr visual-question-answering visual-understanding

About

SmolVLM-Instruct by HuggingFaceTB

Example Output

Prompt:

"Where do the severe droughts happen according to this image?"

Output:

The severe droughts happen in eastern and southern Africa.

Performance Metrics

0.50s Prediction Time
0.51s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/M41uQ4M8J9FEqxRJ0tNnliJF2PNJIeGjdid66k2uHOLgv5OJ/weather.png",
  "prompt": "Where do the severe droughts happen according to this image?",
  "max_new_tokens": 500
}
Input Parameters
image (required) Type: string
Input image to process
prompt Type: string Default: Can you describe this image?
Text prompt to guide the model's response
max_new_tokens Type: integer Default: 500 Range: 1 - 2000
Maximum number of tokens to generate
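The parameter list above can be checked on the client side before a request is sent. The following is a minimal sketch in Python, not part of the model's own code: the helper name `build_input` is hypothetical, and the only facts it relies on are the types, defaults, and the 1 - 2000 range listed above.

```python
from typing import Any, Dict

# Defaults and limits copied from the parameter list above.
DEFAULT_PROMPT = "Can you describe this image?"
DEFAULT_MAX_NEW_TOKENS = 500
MAX_NEW_TOKENS_RANGE = (1, 2000)


def build_input(
    image: str,
    prompt: str = DEFAULT_PROMPT,
    max_new_tokens: int = DEFAULT_MAX_NEW_TOKENS,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return the input payload."""
    # `image` is the only required parameter and is passed as a string (e.g. a URL).
    if not isinstance(image, str) or not image:
        raise ValueError("image is required and must be a non-empty string")
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_new_tokens, bool) or not isinstance(max_new_tokens, int):
        raise TypeError("max_new_tokens must be an integer")
    low, high = MAX_NEW_TOKENS_RANGE
    if not low <= max_new_tokens <= high:
        raise ValueError(f"max_new_tokens must be between {low} and {high}")
    return {"image": image, "prompt": prompt, "max_new_tokens": max_new_tokens}
```

Calling `build_input` with only an image URL fills in the documented defaults, while a value such as `max_new_tokens=5000` raises a `ValueError` locally instead of producing a rejected request.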
Output Schema

Output

Type: string

Version Details
Version ID
e79f1e0eb64fe9a145d0a0afd6127d43b37de66eaaa2e00ff3d165bc14097dfb
Version Created
November 30, 2024
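Pinning the version ID above makes a call reproducible even if the model is updated later. The sketch below only builds the HTTP request and does not send it. It assumes Replicate's public predictions endpoint (`https://api.replicate.com/v1/predictions`) with a bearer token, neither of which is stated on this page; the helper name `build_prediction_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: Replicate's HTTP API endpoint for creating predictions.
API_URL = "https://api.replicate.com/v1/predictions"
# Version ID copied from the version details above.
VERSION_ID = "e79f1e0eb64fe9a145d0a0afd6127d43b37de66eaaa2e00ff3d165bc14097dfb"


def build_prediction_request(model_input: dict, token: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a prediction request."""
    body = json.dumps({"version": VERSION_ID, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Example payload taken from the "All Input Parameters" block on this page.
example_request = build_prediction_request(
    {
        "image": "https://replicate.delivery/pbxt/M41uQ4M8J9FEqxRJ0tNnliJF2PNJIeGjdid66k2uHOLgv5OJ/weather.png",
        "prompt": "Where do the severe droughts happen according to this image?",
        "max_new_tokens": 500,
    },
    token="YOUR_API_TOKEN",  # placeholder, not a real credential
)
```

Passing the request to `urllib.request.urlopen` with a real token would submit it; the response format is not described on this page, so parsing it is left out here.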