spuuntries/urna-kp3l
Caption images and answer visual questions from an image and a text prompt. Accepts an input image and an instruction (e...
Found 90 models (showing 1-20)
Caption images and answer visual questions from an image and a text prompt. Accepts an input image and an instruction (e...
Analyzes images and answers questions about them through conversational interaction. Takes an image and a text prompt as...
Answer questions about images and generate captions from a single input image. Provide an image and a natural-language q...
Answer questions about images. Provide an image and a natural-language question to receive a text answer, or switch to c...
Answer questions about images and generate image-grounded text from an image and a text prompt. Perform visual question...
Answer questions about images and generate captions from an input image and a text prompt, returning text. Handle genera...
Analyze images to identify unusual or noteworthy elements based on textual prompts. This model processes an input image...
Analyze images and answer questions about them in natural language. Accepts a text prompt and an optional image and retu...
Caption images and answer questions about images. Takes an image and a text prompt as input and returns text, enabling i...
Analyzes images and generates text descriptions or answers questions about visual content. Uses a projection module trai...
Answer questions about images and perform visual reasoning from an image and a text prompt, returning text. Handle visua...
Analyze images and answer questions from an input image and text instruction, returning text. Support visual question an...
Answer questions about images and generate text descriptions. Accepts an image and a natural-language prompt; returns te...
Analyzes images and answers questions about visual content with enhanced reasoning capabilities. Takes an image and text...
Analyzes images and responds to text prompts about visual content. Takes an image and a text prompt as input, then gener...
Answers questions about images using natural language. Takes an image and text prompt as input and generates contextual...
Analyze documents and images from one or more image inputs plus a text prompt, returning text captions, OCR, and answers...
Answer questions about images and generate captions from an image and a text prompt, outputting text. Perform visual que...
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Generate text responses from prompts with advanced reasoning, code generation, and image analysis capabilities. Supports...