sai88uk/minicpm-v-45-v9
Answer questions about images and videos, perform OCR, and describe scenes, returning text. Accepts an image or a video...
Answer questions about images and videos, perform OCR, and describe scenes, returning text. Accepts an image or a video...
Generate and reason over text from prompts or chat messages, with optional image inputs for multimodal understanding; ou...
Analyze images and return text responses for captioning and visual question answering. Accept an image and a natural-lan...
Generate text from text and image inputs. Perform question answering, reasoning, document summarization, data analysis,...
Caption images and answer visual questions from an input image. Provide an image and either generate a caption or ask a...
Answer questions about images with step-by-step reasoning. Take an image and an optional text prompt and output text, in...
Answer questions about images from an image and text prompt, returning text responses. Perform visual question answering...
Generate structured JSON or free-form text from prompts, with optional web search and tool use. Accepts text and images...
Generates detailed textual descriptions of images based on input prompts. Utilizes a vision-language model to analyze an...
Answer questions about images and generate captions from an image and a text prompt, returning text. Perform visual ques...
Generates descriptive text captions from three input images using arithmetic operations on image features. The model com...
Generate image captions from a single image input. Select from COCO or Conceptual Captions modes and optionally use beam...
Caption images. Accepts an image input and generates a zero-shot natural-language description, optionally conditioned by...
Generate and chat in natural language from text prompts, with optional image inputs for visual understanding and image-t...
Generate and analyze text and code from prompts and images. Accepts chat messages and optional image inputs and returns...
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Generate text from prompts and optionally analyze images with captioning and visual question answering. Accepts text and...
Analyze images or video and generate text captions, answers, and summaries. Accepts single or multiple images or a video...
Caption images and answer visual questions from an input image, returning text. Accept an image and an optional instruct...
Caption images and videos and answer visual questions. Accepts an optional image or video plus a text prompt and returns...