sljeff/dots.ocr
Extract document layout and text from an image into structured JSON. Accepts a scanned page or document image and return...
Found 49 models (showing 21-40)
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Answer questions about images and GUI screenshots, returning text responses or, for UI tasks, a step-by-step plan with n...
Caption images and answer visual questions from an image plus a text prompt, returning text. Support multilingual prompt...
Convert PDFs, Office documents, images, audio, HTML, and other files to clean Markdown text. Accepts PDF, DOCX, PPTX, XL...
Convert PDFs to Markdown text. Accepts a PDF input and extracts content using the embedded text layer (txt) or optical c...
Answer questions about images. Accept an image and a natural-language prompt and return text, enabling visual question a...
Answer questions about images from an image and text prompt, returning text. Support visual question answering, image ca...
Extract text from PDF documents using OCR, returning plain text. Accepts a PDF as input, converts each page to an image,...
Extract text and layout metadata from PDF pages. Takes a PDF and page number as input and returns a structured JSON stri...
Extract text and structured data from images and multi-page PDFs. Provide an image or PDF plus a prompt string for scene...
Extract text from images and PDFs using OCR. Accepts an image or PDF input and returns the recognized text as plain text...
Analyze images to caption content, detect objects, segment regions, and extract text (OCR). Accepts an image and an opti...
Extract text from images with OCR. Accepts a single image input and returns the recognized text; when the format option...
Extract text from images (OCR). Accepts an image and an optional language code (default Chinese) and returns recognized...
Recognize and extract handwritten text from images, returning the detected text as a string.
Extract LaTeX-formatted math from images or PDFs and return Markdown text. Takes an image (or PDF) containing equations...
Extract on-screen text with pixel coordinates from images and screenshots. Takes an image as input and returns readable...
Answer questions about images and documents and generate captions from an image plus a text prompt, returning text. Sele...
Caption images and answer visual questions from an input image and text prompt. Accepts an image and a natural-language...