
lidarbtc/kollava-v1.5
Answer questions about images in Korean. Takes an image and a Korean text prompt and returns Korean text, supporting vis...
Found 122 models (showing 41-60)
Answer questions about images. Provide an image and a natural-language question and receive a text response. Handle gene...
Caption images and answer questions about images. Takes an image plus an optional question and prior Q/A context and ret...
Generate text prompts from an input image for use with text-to-image models like Stable Diffusion. Analyze the image wit...
Estimate a person's age from an image. Accepts a photo of a person and returns the predicted age as an integer (1–99). U...
Answer questions about an input image and generate captions. Takes an image plus a text prompt and returns a text respon...
Generate optimized text prompts for text-to-image models from an input image. Combine CLIP and BLIP to extract subject,...
Generate and reason over text from prompts or chat messages, with optional image inputs, returning text outputs. Solve m...
Identify bird species and answer bird-related questions from an input image and text prompt. Accepts a bird photo and a...
Answer questions about images and GUI screenshots, returning text responses or, for UI tasks, a step-by-step plan with n...
Generate text and analyze images from prompts or chat messages, optimized for low latency and cost. Accepts text and opt...
Generate text and analyze images with a fast, low-cost multimodal GPT-4o variant. Accept text prompts or chat message ar...
Generate text responses from prompts or chat messages, with optional image inputs for visual reasoning. Accepts text and...
Generate text from prompts or chat messages and answer questions about images. Accepts text and optional images as input...
Generate and reason over text from prompts, with optional image analysis. Accepts text and an optional image, and return...
Answer questions about images in a multi-turn chat from an image and text prompt, returning text. Perform visual questio...
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Analyze images and return text responses for captioning and visual question answering. Accept an image and a natural-lan...
Answer questions about images and generate text from an image plus a prompt. Accepts a single image and textual instruct...
Answer questions about an input image. Accepts an image and a natural-language question and returns a text answer, enabl...