lidarbtc/kollava-v1.5
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Found 54 models (showing 21-40)
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Answer questions about images and generate captions from an image input and a natural-language question, returning text....
Caption images and answer visual questions from an input image. Provide an image and either generate a caption or ask a...
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Identify bird species and answer bird-related questions from an input image and text prompt, returning text. Perform vis...
Generates text descriptions, stories, and responses based on input images and prompts. Takes an image and text prompt as...
Answer questions about images. Accepts an image and a natural-language question and returns a text answer for visual que...
Answer questions about images and generate detailed image captions using MiniGPT-4 with the Vicuna-13B language model. Takes...
Analyzes images and answers questions about them using MiniGPT-4 with the Vicuna-7B language model. Takes an image and an op...
Caption images and answer visual questions from an input image and text prompt, returning text. Handle multilingual outp...
Answer questions about images and generate captions from an image input. Takes an image and a text prompt (e.g., “Descri...
Answer questions about images from a text prompt, returning a text response. Accepts an input image and a prompt and out...
Answer questions about images from a text prompt and return a text response. Accept a single image plus a natural-langua...
Answer questions about images from an image and a text prompt, returning text. Generate captions, short answers, and exp...
Caption images and perform visual question answering from an image and a text prompt, returning a text response. Choose...
Analyze images and answer questions about visual content using a Mixture-of-Experts vision-language model. Processes sin...
Answer questions about an image and generate captions and summaries. Accepts a single image and a natural-language quest...
Caption images and answer visual questions from an input image, returning text. Accepts an image and a natural-language...
Analyzes images and generates text descriptions or answers questions about visual content. Uses a projection module trai...
Analyze documents and images from one or more image inputs plus a text prompt, returning text captions, OCR, and answers...