aodianyun/minicpm-v-26
Caption images and videos. Take an image or video plus an optional prompt and return text that describes the visual cont...
Found 157 models (showing 61-80)
Generate text descriptions and answers from an input image or video. Accept an optional instruction or question prompt t...
Generate text descriptions for images and videos. Accepts a single image or video plus an optional instruction prompt, a...
Answer questions about images and generate captions. Accepts an image and an optional text prompt/question, and returns...
Generate and reason over text for coding, question answering, and multi-step problem solving. Accepts text prompts or ch...
Generate and reason over text from prompts or chat messages, with optional image inputs for multimodal understanding; ou...
Generate and reason over text and code from a prompt, with optional image input for captioning and visual analysis, and...
Generate and analyze text and code from a prompt, with optional image input for visual understanding and data extraction...
Generate structured JSON or free-form text from text and image inputs. Conform outputs to JSON Schema or simple_schema,...
Caption images and answer natural-language questions about an input image. Provide an image and an optional instruction...
Answer questions about images and generate captions from an image and a text query, returning text. Accept a single imag...
Generate text from prompts or chat messages, with optional image inputs for visual understanding and captioning, and ret...
Generate and analyze text with optional image input. Accept a text prompt and an optional image and return text, support...
Generate text from text and image inputs. Perform question answering, reasoning, document summarization, data analysis,...
Generate text from prompts and optionally analyze images with captioning and visual question answering. Accepts text and...
Generate text and analyze images from a text prompt (optionally with an image), returning text for conversation, caption...
Caption images and answer visual questions from an input image and text prompt, returning text. Handle multilingual outp...
Answer questions about images and generate captions from an image input. Takes an image and a text prompt (e.g., “Descri...
Answer questions about images. Takes an image and a text prompt and returns a text response, enabling visual question an...
Caption images. Takes an input image and returns a short natural-language description as text, useful for alt text, acce...