aodianyun/minicpm-v-26
Caption images and videos. Accepts an image or video plus an optional prompt and returns text descriptions or summaries...
Found 125 models (showing 61-80)
Generate captions, answers, and summaries from an input image or video. Accept an image or video plus an optional prompt...
Generate text descriptions for images and videos. Accepts a single image or video plus an optional instruction prompt, a...
Answer questions about images and generate captions from an input image and text prompt. Accepts an image plus a message...
Generate and reason over text for chat, coding, and complex problem solving. Accept prompts or multi-turn messages and o...
Generate text from prompts or chat messages, with optional image inputs for multimodal reasoning and captioning. Accept...
Generate text from prompts with optional image analysis and captioning. Takes a text prompt (and optionally an image) an...
Generate and reason over text from a prompt and optional image. Support code generation, code understanding, front-end w...
Generate structured JSON or free-form text from prompts. Accept text and image inputs to analyze visuals and return capt...
Caption images and answer visual questions from an image and a text prompt. Accepts an image and a message (question or...
Answer questions about images and generate captions from an image and a natural-language query. Takes an image plus a te...
Solve complex reasoning tasks and generate text from prompts or chat messages. Accept text or messages and optional imag...
Generate text and understand images from text and optional image inputs. Handle chat, question answering, document summa...
Generate text and analyze images from text and image inputs. Handle question answering, long-context reasoning (128K tok...
Generate text from prompts and optionally analyze images to produce text outputs. Accepts text and optional image inputs...
Generate text and answer questions about images from a text prompt, returning text outputs for chat, captioning, and vis...
Caption images and answer visual questions from an image plus a text prompt, returning text. Support multilingual prompt...
Caption images and answer questions about images. Accepts an image and a natural-language prompt (e.g., “Describe this i...
Answer questions about images. Accept an image and a natural-language prompt and return text, enabling visual question a...
Caption images from a single image input, returning a natural-language description of the scene. Accepts one image and o...