lucataco/moondream1 🖼️📝 → 📝

▶️ 11.5K runs 📅 Jan 2024 ⚙️ Cog 0.9.3 🔗 GitHub 📄 Paper ⚖️ License
image-to-text ocr visual-understanding

About

(Research only) Moondream1 is a vision language model that performs on par with models twice its size.

Example Output

Prompt:

"What is the title of this book?"

Output:

The Little Book of Deep Learning

Performance Metrics

0.99s Prediction Time
1.00s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/KHevN9pbiFQqC5LlI4WBzM8aoAEEMXEVcZoHy0xNAjsEVHKD/lbdl.jpg",
  "prompt": "What is the title of this book?"
}
Input Parameters
image (required) Type: string
Grayscale input image
prompt Type: string
Prompt to use for generation
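
The two parameters above map directly onto the JSON input object shown in the example. As a minimal sketch, assuming the `replicate` Python client is installed and a `REPLICATE_API_TOKEN` is set in the environment, a request could be assembled as follows. The helper names `build_input` and `run_moondream` are hypothetical and are not part of the model or the client; the version ID is the one listed on this page.

```python
# Model reference in "owner/name:version" form, using the version ID from this page.
MODEL_VERSION = (
    "lucataco/moondream1:"
    "ecd26482e4c9220957e22290cb616200b51217fe807f61653a8459ed7541e9d5"
)


def build_input(image, prompt=None):
    """Build the input dict: `image` is required, `prompt` is optional.

    Hypothetical helper for illustration only.
    """
    if not isinstance(image, str) or not image:
        raise ValueError("image is required and must be a non-empty string")
    payload = {"image": image}
    if prompt is not None:
        if not isinstance(prompt, str):
            raise TypeError("prompt must be a string")
        payload["prompt"] = prompt
    return payload


def run_moondream(image, prompt=None):
    """Send the request to Replicate (needs network access and an API token)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(MODEL_VERSION, input=build_input(image, prompt))
```

Validating the payload locally before the call is a design choice of this sketch: it surfaces a missing `image` without spending a prediction.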
Output Schema

Output

Type: array, Items Type: string
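
Because the output is an array of strings rather than a single string, a caller will usually want to combine the items into one answer. The sketch below assumes the items are consecutive text fragments that should be concatenated in order; this page does not confirm that, and the sample fragments are made up for illustration. `join_output` is a hypothetical helper name.

```python
def join_output(chunks):
    """Concatenate an iterable of string fragments into one answer string."""
    return "".join(chunks).strip()


# Illustrative fragments only; a real prediction may split the text differently.
sample = ["The Little Book", " of Deep", " Learning"]
print(join_output(sample))  # prints "The Little Book of Deep Learning"
```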

Version Details
Version ID
ecd26482e4c9220957e22290cb616200b51217fe807f61653a8459ed7541e9d5
Version Created
January 24, 2024