lucataco/moondream1 🖼️📝 → 📝

▶️ 11.5K runs 📅 Jan 2024 ⚙️ Cog 0.9.3

image-to-text ocr visual-question-answering visual-understanding

Performance

1.0s typical run time

11.5K total runs

About

(Research only) Moondream1 is a vision-language model that performs on par with models twice its size.

Example Output

Prompt:

"What is the title of this book?"

Output:

The Little Book of Deep Learning

Performance Metrics

0.99s Prediction Time

1.00s Total Time

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/KHevN9pbiFQqC5LlI4WBzM8aoAEEMXEVcZoHy0xNAjsEVHKD/lbdl.jpg",
  "prompt": "What is the title of this book?"
}
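
The input above can be sent from code as well as from the web form. The following is a minimal sketch, assuming the official `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set; the helper names `build_input` and `ask` are hypothetical and exist only for illustration. The model name and version ID are the ones listed on this page.

```python
# Identifiers copied from this page.
MODEL = "lucataco/moondream1"
VERSION = "ecd26482e4c9220957e22290cb616200b51217fe807f61653a8459ed7541e9d5"


def build_input(image_url: str, prompt: str) -> dict:
    """Return the input payload in the same shape as the example above."""
    if not image_url:
        # The page marks `image` as required.
        raise ValueError("image is required")
    return {"image": image_url, "prompt": prompt}


def ask(image_url: str, prompt: str):
    """Run one prediction (hypothetical helper).

    Assumes `pip install replicate` and REPLICATE_API_TOKEN in the environment.
    """
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(f"{MODEL}:{VERSION}", input=build_input(image_url, prompt))
```

Calling `ask(...)` with the image URL and prompt from the example should reproduce the run shown here, although the exact return type may vary with the client version.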

Input Parameters

image (required) • Type: string • Grayscale input image
prompt • Type: string • Prompt to use for generation

Output Schema

Output

Type: array • Items Type: string
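
Because the output is an array of strings rather than a single string, a caller usually has to reassemble it. The sketch below assumes the items are consecutive text chunks that concatenate with no separator (typical for streamed text output, but not stated on this page); `join_output` is a hypothetical helper name.

```python
from typing import Iterable


def join_output(chunks: Iterable[str]) -> str:
    """Concatenate the output items into one answer string.

    Assumption: items are consecutive text fragments, so no separator is added.
    """
    return "".join(chunks).strip()


# Illustrative chunks only; the real chunk boundaries are not documented here.
answer = join_output(["The Little", " Book of", " Deep Learning"])
```

If the items turn out to be independent answers rather than fragments, joining with a newline would be the more appropriate choice.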

Version Details

Version ID: ecd26482e4c9220957e22290cb616200b51217fe807f61653a8459ed7541e9d5
Version Created: January 24, 2024
