
lucataco/qwen-vl-chat
Answer questions about images. Provide an image and a natural-language question to receive a text response that handles...
Found 45 models (showing 1-20)
Answer questions about images. Provide an image and a natural-language question to receive a text response that handles...
Analyze images to identify unusual or noteworthy elements based on textual prompts. This model processes an input image...
Answer questions about images and extract text from images. Takes an image and a text prompt and returns text, enabling...
Answer questions about images in a multi-turn chat from an image and text prompt, returning text. Perform visual questio...
Generate captions and answer visual questions for images and videos from a text prompt. Accepts a single image or a vide...
Answer questions about images from an image and text prompt, returning text. Perform visual question answering (VQA), im...
Answer questions about images from an image input and a text prompt, returning text. Support visual question answering (...
Answer questions about images and caption them from an image plus text prompt, returning text. Perform visual recognitio...
Segment objects and answer questions from an input image using natural-language instructions. Provide an image and an in...
Generate and reason over text from prompts or chat messages, with optional image inputs, returning text outputs. Solve m...
Generate and reason over text for chat, coding, and complex problem solving. Accept prompts or multi-turn messages and o...
Generate text and code from a prompt, with optional image input for captioning and visual analysis. Supports fast standa...
Answer questions about images and GUI screenshots, returning text responses or, for UI tasks, a step-by-step plan with n...
Generate text quickly for chat, question answering, code generation, classification, summarization, and translation. Acc...
Generate text and analyze images from prompts or chat messages, optimized for low latency and cost. Accepts text and opt...
Generate text for chat, coding, and reasoning from text prompts or multi-turn messages, with optional image input for an...
Generate text and analyze images with a fast, low-cost multimodal GPT-4o variant. Accept text prompts or chat message ar...
Generate text responses from prompts or chat messages, with optional image inputs for visual reasoning. Accepts text and...
Solve complex reasoning tasks and generate text from prompts or chat messages. Accept text or messages and optional imag...
Generate and reason over text from a prompt, optionally analyze images to extract data and answer visual questions. Prod...