OCR AI Models - Page 2 - Cloudernative

sljeff/dots.ocr

Extract structured document layout and text from an image input and return a single JSON output. Parse page elements wit...

🖼️ • ocr • document-to-json • image-object-detection • 4.8K runs

lucataco/smolvlm-instruct

Analyzes images and generates text responses based on visual content and text prompts. Accepts arbitrary sequences of im...

🖼️ → 📝 • image-to-text • visual-understanding • document-understanding • 8.3K runs

cjwbw/cogagent-chat

Answer questions about images and GUI screenshots. Takes an image and a natural-language query and returns a text respon...

🖼️ → 📝 • image-to-text • ocr • visual-question-answering • 2.3K runs

lucataco/paligemma-3b-pt-224

Analyzes images and generates text responses based on prompts and visual content. Built on Google's PaliGemma 3B archite...

🖼️ → 📝 • image-to-text • visual-understanding • image-analysis • 4.1K runs

cuuupid/markitdown

Convert PDFs, Office files, images, audio, HTML, and structured data to Markdown for LLM ingestion, indexing, and analys...

pdf-to-markdown • ocr • speech-to-text • 73.0K runs

aodianyun/ad-pdf-extract

Convert PDFs to Markdown with optional OCR for scanned documents. Accepts a PDF and a method setting (auto, txt, or ocr)...

pdf-to-markdown • ocr • 235 runs

jyoung105/imp

Answer questions about images. Takes an image and a text prompt and returns a text response, enabling visual question an...

🖼️ → 📝 • image-to-text • visual-question-answering • 71 runs

jyoung105/moondream

Answer questions about images from a text prompt, returning a text response. Accepts an input image and a prompt and out...

🖼️ → 📝 • image-to-text • ocr • visual-question-answering • 310 runs

vwtyler/ocr-pdf

Extract text from PDF documents. Accepts a PDF URL and returns plain text by converting each page to an image and runnin...

ocr • pdf-to-text • 2.6K runs

lucataco/olmocr-7b

Extract text and document metadata from PDF pages, returning structured JSON. Accepts a PDF and page number, then transc...

ocr • document-to-json • 4.0K runs

jigsawstack/vocr

Extract text and structured data from images and multi-page PDFs using visual OCR and layout analysis. Accept an image o...

🖼️ → 📝 • ocr • document-to-json • image-to-text • 20 runs

qr2ai/img2txt

Extract text from images or PDFs using OCR. Accepts an image or PDF plus a language selection (English “eng” or Arabic “...

ocr • arabic-ocr • 34 runs

hiscodesmells/florence-2-base

Performs multiple computer vision tasks on images including captioning, object detection, OCR, and segmentation. Takes a...

🖼️ → 📝 • image-to-text • object-detection • ocr • 323 runs

jichengdu/got-ocr-2

Extract text from images with optional layout-preserving HTML reconstruction. Accept a single image and output plain tex...

ocr • layout-ocr • 307 runs

hexiaochun/pp-ocr-v4

Extract text from images (OCR). Accepts an image and an optional language setting (default Chinese) and returns structur...

ocr • 441.0K runs

charlesfrye/text-recognizer-gpu

Recognizes and extracts text from images of handwritten text, outputting the detected text as a string.

text-recognition • optical-character-recognition • ocr • 19.9K runs

jd7h/texify

Convert images or PDFs containing mathematical notation into Markdown/LaTeX text. Accept an image input and return a tex...

🖼️ → 📝 • ocr • image-to-text • 72 runs

zsxkib/easyocr

Extract text with pixel coordinates from images and screenshots. Accepts an image and returns readable text (markdown) p...

🖼️ → 📝 • ocr • image-to-text • 104 runs

cjwbw/pix2struct

Analyzes images and answers questions or generates captions based on textual prompts. Provides six specialized models tr...

🖼️ → 📝 • image-to-text • visual-question-answering • ocr • 6.1K runs

lucataco/moondream2

Analyzes images and generates text descriptions based on visual content and optional prompts. This small vision language...

🖼️ → 📝 • image-to-text • visual-understanding • ocr • 13.1M runs