bytonylee/moondream 🖼️📝✓ → 📝

▶️ 313 runs 📅 Jan 2024 ⚙️ Cog 0.8.6 🔗 GitHub ⚖️ License

image-to-text ocr visual-question-answering

Performance

4.0s Typical run time

~264s Cold start (first call)

313 Total runs

About

Tiny vision language model

Example Output

Prompt:

"What is the title of this book?"

Output:

The Little Book of Deep Learning

Performance Metrics

3.98s Prediction Time

263.82s Total Time
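
The gap between the two figures above can be worked out directly. The short sketch below assumes that "Total Time" covers everything from request to result (including model startup) and that "Prediction Time" covers only the inference step; the page does not define either term, so treat the split as an estimate. The function name is illustrative.

```python
# Figures copied from the Performance Metrics above.
PREDICTION_TIME_S = 3.98
TOTAL_TIME_S = 263.82


def startup_overhead(total_s: float, prediction_s: float) -> float:
    """Time not spent on inference, under the assumption stated above."""
    return round(total_s - prediction_s, 2)


overhead = startup_overhead(TOTAL_TIME_S, PREDICTION_TIME_S)
share = round(100 * overhead / TOTAL_TIME_S, 1)
print(f"Non-inference time: {overhead}s ({share}% of the total)")
```

Under that assumption, nearly all of the first call is waiting for the model to start, so a client timeout for a cold call should be sized from the total time rather than the typical run time.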

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/KHcvVMn21kcN62J7qUBDGmTYxjZHQ8aCm0NuaKhmYtB9j3Fh/demo-1.jpg",
  "prompt": "What is the title of this book?",
  "agree_to_research_only": true
}

Input Parameters

image (required) Type: string. Input image
prompt (required) Type: string. Input prompt
agree_to_research_only Type: boolean. Default: true. You must agree to use this model only for research. It is not for commercial use.
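
As a minimal sketch of how these three parameters could be sent, the example below builds a request for Replicate's HTTP predictions endpoint using only the Python standard library. The helper names (build_input, build_request, create_prediction) are hypothetical, the REPLICATE_API_TOKEN environment variable is an assumption about how the token is supplied, and the client-side refusal when agree_to_research_only is false is an assumption based on the parameter description, not documented model behavior.

```python
import json
import os
import urllib.request

# Version ID taken from the Version Details on this page.
VERSION_ID = "5694cffb018a16076fe1a18b93ab005dc5aeb2fac76da9761ff9bb8d6bacba4d"
API_URL = "https://api.replicate.com/v1/predictions"


def build_input(image: str, prompt: str, agree_to_research_only: bool = True) -> dict:
    """Assemble the input object from the parameters listed above."""
    if not image or not prompt:
        raise ValueError("image and prompt are both required")
    if not agree_to_research_only:
        # Assumption: refuse locally rather than send a request the model may reject.
        raise ValueError("this model is offered for research use only")
    return {
        "image": image,
        "prompt": prompt,
        "agree_to_research_only": agree_to_research_only,
    }


def build_request(model_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a prediction."""
    body = json.dumps({"version": VERSION_ID, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def create_prediction(model_input: dict) -> dict:
    """Send the request; needs network access and a valid API token."""
    token = os.environ["REPLICATE_API_TOKEN"]
    with urllib.request.urlopen(build_request(model_input, token)) as response:
        return json.loads(response.read())
```

Calling create_prediction(build_input(image_url, "What is the title of this book?")) would submit the same prompt as the example on this page. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.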

Output Schema

Output

Type: string
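
Because the output is a single string, reading a result takes little code. The sketch below assumes the prediction arrives as a dictionary with "status" and "output" fields, as Replicate's prediction objects provide; the function name and the sample dictionary are illustrative only.

```python
def extract_output(prediction: dict) -> str:
    """Return the model's text answer from a finished prediction."""
    status = prediction.get("status")
    if status != "succeeded":
        raise RuntimeError(f"prediction not finished successfully (status: {status})")
    output = prediction.get("output")
    if not isinstance(output, str):
        # The schema above declares a string; anything else is unexpected.
        raise TypeError(f"expected a string output, got {type(output).__name__}")
    return output.strip()


# Illustrative prediction object, using the example output from this page.
sample = {"status": "succeeded", "output": "The Little Book of Deep Learning"}
print(extract_output(sample))
```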

Version Details

Version ID: 5694cffb018a16076fe1a18b93ab005dc5aeb2fac76da9761ff9bb8d6bacba4d
Version Created: January 24, 2024
