cuuupid/glm-4v-9b
Answer questions about images and extract text from images. Takes an image and a text prompt and returns a text response...
Found 90 models (showing 41-60)
Caption images and answer visual questions from a text prompt and optional image, returning text. Support long-context i...
Analyze images and answer questions about visual content using a Mixture-of-Experts vision-language model. Takes an imag...
Analyzes images and generates text descriptions or responses to prompts about visual content. Processes diverse image ty...
Generates text responses from text prompts, messages, and images with multimodal capabilities. Processes both text and v...
Answer questions about images. Accepts an image and a natural-language question and returns a text answer for visual que...
Caption images. Takes a single image as input and returns a concise natural-language description of the scene, objects,...
Answer questions about an image and generate captions and summaries. Accepts a single image and a natural-language quest...
Caption images and answer visual questions from an input image. Optionally evaluate image-text matching. Provide an imag...
Analyzes images and answers questions about them using a visual language model. Takes an image and a text query as input...
Analyzes images and answers questions about visual content through multimodal conversation. Designed as a foundation mod...
Generate text responses with advanced reasoning capabilities, specializing in math, coding, and visual analysis. Process...
Caption images and answer visual questions from an input image and text prompt. Accept an image plus a question or instr...
Caption images and answer visual questions from an input image, returning text. Accepts an image and a natural-language...
Generate text based on text prompts and optional image inputs. This multimodal language model handles both text and imag...
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Caption images and answer visual questions from an image plus an optional text prompt, returning text. Handle OCR-style...
Generate text for chat, Q&A, coding, and document workflows with fast, low-latency responses. Accept text prompts and op...
Generate text responses from prompts with support for image analysis and visual understanding. Fast, lightweight languag...
Generate and reason over text with optional image inputs, returning text outputs. Handle long-context tasks with a 200k-...