
pku-yuangroup/llava-cot
Answer questions about images with step-by-step reasoning. Take an image and an optional text prompt and output text, in...
Found 129 models (showing 121-129)
Answer questions about images from a text prompt and an image input. Accepts an image and instruction, and returns text...
Answer questions about images and generate captions from an image and a text prompt, returning text. Perform visual ques...
Extract text and document structure from images and documents. Accepts PDF, DOC/DOCX, PPT/PPTX, and raster images and re...
Generate and chat with text from prompts or multi-turn messages, and analyze images for captions and visual Q&A. Provide...
Generates descriptive text captions from three input images using arithmetic operations on image features. The model com...
Classify images into categories. Accepts a single image and returns top predicted classes with probabilities using a Res...
Extract text from images (OCR), convert documents to Markdown, parse charts and tables, and locate content by reference...
Identify the issuing country of a national ID card from an input image. Takes a single ID card photo and outputs the pre...