lidarbtc/kollava-v1.5
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Found 154 models (showing 41-60)
Answer questions about images and generate captions from an image input and a natural-language question, returning text....
Caption images and answer visual questions from an input image. Provide an image and either generate a caption or ask a...
Generate text prompts from an input image for use with text-to-image models. Analyze artists, mediums, and styles using...
Predict a person's age from an input image. Takes a photo containing a face and returns an estimated age (1-99) as text....
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Generate text prompts from an input image. Combine CLIP and BLIP to analyze the image and produce descriptive prompts op...
Answer questions, write code, and analyze images with a fast, cost-efficient reasoning model. Accept a single prompt or...
Identify bird species and answer bird-related questions from an input image and text prompt, returning text. Perform vis...
Answer questions about images and GUI screenshots. Takes an image and a natural-language query and returns a text respon...
Chat and generate text with low latency and cost, with optional image inputs for visual reasoning and captioning. Accept...
Generate and chat in natural language from text prompts, with optional image inputs for visual understanding and image-t...
Chat with a multimodal large language model using text and optional images as input and receive streamed text outputs. G...
Generate and reason over text and code with up to a 1M-token context, and analyze images to produce text answers, captio...
Generate and reason over text with optional image inputs, returning text outputs. Handle long-context tasks with a 200k-...
Analyze images and answer questions about them in natural language. Accepts a text prompt and an optional image and retu...
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Analyze images and return text responses for captioning and visual question answering. Accept an image and a natural-lan...
Answer questions about images and generate image-grounded text. Takes an image and a text prompt and returns text, enabl...
Answer questions about images. Accepts an image and a natural-language question and returns a text answer for visual que...