lidarbtc/kollava-v1.5
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Found 52 models (showing 21-40)
Answer questions about images and generate captions from an image input and a natural-language question, returning text....
Caption images and answer visual questions from an input image. Provide an image and either generate a caption or ask a...
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Identify bird species and answer bird-related questions from an input image and text prompt, returning text. Perform vis...
Answer questions about images and generate image-grounded text. Takes an image and a text prompt and returns text, enabl...
Answer questions about images. Accepts an image and a natural-language question and returns a text answer for visual que...
Answer questions about images and generate captions. Accepts an image and an optional text prompt/question, and returns...
Caption images and answer natural-language questions about an input image. Provide an image and an optional instruction...
Caption images and answer visual questions from an input image and text prompt, returning text. Handle multilingual outp...
Answer questions about images and generate captions from an image input. Takes an image and a text prompt (e.g., “Descri...
Answer questions about images from a text prompt, returning a text response. Accepts an input image and a prompt and out...
Answer questions about images from a text prompt and return a text response. Accept a single image plus a natural-langua...
Answer questions about images from an image and a text prompt, returning text. Generate captions, short answers, and exp...
Caption images and perform visual question answering from an image and a text prompt, returning a text response. Choose...
Answer questions about images and documents. Accepts 1–3 images plus an instruction and returns text for tasks like visu...
Answer questions about an image and generate captions and summaries. Accepts a single image and a natural-language quest...
Caption images and answer visual questions from an input image, returning text. Accepts an image and a natural-language...
Answer questions and caption images from an input image and text prompt. Accept an image plus a natural-language query a...
Analyze documents and images from one or more image inputs plus a text prompt, returning text captions, OCR, and answers...