spuuntries/urna-kp3l
Caption images and answer visual questions from an image and a text prompt. Accepts an input image and an instruction (e...
Found 86 models (showing 1-20)
Answer questions about images. Accept an image and a text prompt and return text outputs for visual question answering,...
Answer questions about images and generate captions from a single input image. Provide an image and a natural-language q...
Answer questions about images. Provide an image and a natural-language question to receive a text answer, or switch to c...
Answer questions about images and generate image-grounded text from an image and a text prompt. Perform visual question...
Answer questions about images and generate captions from an input image and a text prompt, returning text. Handle genera...
Analyze images to identify unusual or noteworthy elements based on textual prompts. This model processes an input image...
Analyze images and answer questions about them in natural language. Accepts a text prompt and an optional image and retu...
Caption images and answer questions about images. Takes an image and a text prompt as input and returns text, enabling i...
Answer questions and caption images from an input image and text prompt. Accept an image plus a natural-language query a...
Analyze images and generate text responses to prompts. Accepts an image and a text prompt, and outputs text for visual q...
Analyze images and answer questions from an input image and text instruction, returning text. Support visual question an...
Answer questions about images and generate text descriptions. Accepts an image and a natural-language prompt; returns te...
Answer questions about images from a single image input and a text prompt, returning a single-turn text response. Perfor...
Answer questions about images and generate captions from an input image and text prompt. Output free-form text grounded...
Answer questions about an image and generate captions, returning text based on visual content. Provide a single image an...
Analyze documents and images from one or more image inputs plus a text prompt, returning text captions, OCR, and answers...
Answer questions about images and generate captions from an image and a text prompt, outputting text. Perform visual que...
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Generate and reason over text from prompts or multi-turn chat, with optional image inputs for vision understanding and i...