OCR AI Models - Page 3 - Cloudernative

chenxwh/deepseek-vl2

Analyze images and answer questions about visual content using a Mixture-of-Experts vision-language model. Processes sin...

🖼️ → 📝 • image-to-text • ocr • visual-understanding • 1.1K runs

lucataco/minicpm-v-4

Analyze images and videos with text prompts to generate detailed text responses. Handles single images, multiple images,...

🖼️ → 📝 • image-to-text • video-to-text • visual-understanding • 795 runs

lucataco/florence-2-base

Performs multiple vision and vision-language tasks based on text prompts. Supports image captioning with varying detail...

🖼️ → 📝 • image-to-text • object-detection • ocr • 133.5K runs

lucataco/ollama-llama3.2-vision-90b

Generates text responses based on image and text inputs using Meta's Llama 3.2-Vision 90B multimodal language model. Per...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 4.6K runs

lucataco/florence-2-large

Analyze images to generate captions, detect objects, and extract text (OCR). Accepts an image plus a task selector and o...

🖼️ → 📝 • image-to-text • image-object-detection • ocr • 471.5K runs

deepseek-ai/deepseek-vl2-small

Analyze images and answer questions about visual content using a Mixture-of-Experts vision-language model. Takes an imag...

🖼️ → 📝 • image-to-text • ocr • text-generation • 6.5K runs

lucataco/ollama-llama3.2-vision-11b

Generates text responses based on both text prompts and images using Meta's Llama 3.2 Vision 11B model. Analyzes and und...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 9.9K runs

microsoft/omniparser-v2

Parse GUI screenshots into structured UI elements with bounding boxes and captions. Accepts an image of a desktop or mob...

🖼️ → 📝 • image-to-text • image-object-detection • ui-parsing • 185.5K runs

sulthonmb/ocr-receipt

Extract structured purchase data from receipt images as JSON. Input a receipt image and output JSON with line items, qua...

ocr • document-to-json • receipt-parsing • 675 runs

datalab-to/ocr

Extract text from images and documents in 90+ languages with OCR, returning plain text plus optional structured layout....

ocr • layout-analysis • table-recognition • 426 runs

datalab-to/marker

Convert documents to Markdown and structured JSON. Accept PDF, DOC/DOCX, PPT/PPTX, and image files (PNG/JPG/WEBP) as inp...

pdf-to-markdown • document-to-json • ocr • 361 runs

lucataco/deepseek-ocr

Converts images containing documents, PDFs, charts, and handwritten text into structured markdown while preserving forma...

ocr • pdf-to-markdown • document-to-json • 93.6K runs

lucataco/qwen-vl-chat

Analyzes images and answers questions about them through conversational interaction. Takes an image and a text prompt as...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 826.4K runs

nvidia/nemotron-nano-v2-12b-vl

Analyzes images and videos to answer questions, extract data, and provide detailed descriptions. Supports processing up...

🖼️ → 📝 • image-to-text • video-to-text • document-to-json • 988 runs

ghostljj/deepseek-ocr

Extract text and convert documents to markdown format from images using optical character recognition. Supports multiple...

🖼️ → 📝 • ocr • pdf-to-markdown • document-to-json • 92 runs

eiby777/olmocr-2-7b-1025-fp8

Extract text and tables from document images or PDFs. Accepts an image or a selected PDF page and returns structured tex...

🖼️ → 📝 • ocr • image-to-text • 10 runs

lucataco/blip3-phi3-mini-instruct-r-v1

Answer questions about images and generate captions from an image input and a natural-language question, returning text....

🖼️ → 📝 • image-to-text • ocr • visual-question-answering • 399 runs

perceptron-ai-inc/isaac-0.1

Analyzes images and answers questions about visual content with spatially-aware responses. Takes an image and a text pro...

🖼️ → 📝 • image-to-text • visual-understanding • ocr • 39.1K runs

jyoung105/honeybee

Analyzes images and generates text responses based on both the image content and text prompts. Uses a locality-enhanced...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 30 runs

chenxwh/cogvlm2

Caption images and answer visual questions from an image plus an optional text prompt, returning text. Handle OCR-style...

🖼️ → 📝 • image-to-text • ocr • visual-question-answering • 6.6K runs