sljeff/dots.ocr
Extract structured document layout and text from an image input and return a single JSON output. Parse page elements wit...
Found 62 models (showing 21-40)
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Answer questions about images and GUI screenshots. Takes an image and a natural-language query and returns a text respon...
Caption images and answer visual questions from an input image and text prompt, returning text. Handle multilingual outp...
Convert PDFs, Office files, images, audio, HTML, and structured data to Markdown for LLM ingestion, indexing, and analys...
Convert PDFs to Markdown with optional OCR for scanned documents. Accepts a PDF and a method setting (auto, txt, or ocr)...
Answer questions about images. Takes an image and a text prompt and returns a text response, enabling visual question an...
Answer questions about images from a text prompt, returning a text response. Accepts an input image and a prompt and out...
Extract text from PDF documents. Accepts a PDF URL and returns plain text by converting each page to an image and runnin...
Extract text and document metadata from PDF pages, returning structured JSON. Accepts a PDF and page number, then transc...
Extract text and structured data from images and multi-page PDFs using visual OCR and layout analysis. Accept an image o...
Extract text from images or PDFs using OCR. Accepts an image or PDF plus a language selection (English "eng" or Arabic "...
Analyze images to generate captions, extract OCR text, detect objects, and produce segmentation masks and region proposa...
Extract text from images with optional layout-preserving HTML reconstruction. Accept a single image and output plain tex...
Extract text from images (OCR). Accepts an image and an optional language setting (default Chinese) and returns structur...
Recognizes and extracts text from images of handwritten text, outputting the detected text as a string.
Convert images or PDFs containing mathematical notation into Markdown/LaTeX text. Accept an image input and return a tex...
Extract text with pixel coordinates from images and screenshots. Accepts an image and returns readable text (markdown) p...
Caption images and perform visual question answering from an image and a text prompt, returning a text response. Choose...
Answer questions about images and generate captions from an input image and a text prompt, returning text. Handle genera...