cuuupid/glm-4v-9b
Answer questions about images and extract text from images. Takes an image and a text prompt and returns a text response...
Found 89 models (showing 41-60)
Caption images and answer visual questions from a text prompt and optional image, returning text. Support long-context i...
Answer questions about images, documents, charts, and tables. Takes an image and a text prompt and returns text. Support...
Caption images and answer visual questions from an input image and text prompt. Accepts an image and a prompt; outputs t...
Chat with a multimodal large language model using text and optional images as input and receive streamed text outputs. G...
Answer questions about images. Accepts an image and a natural-language question and returns a text answer for visual que...
Caption images. Takes a single image as input and returns a concise natural-language description of the scene, objects,...
Answer questions about an image and generate captions and summaries. Accepts a single image and a natural-language quest...
Caption images and answer visual questions from an input image. Optionally evaluate image-text matching. Provide an imag...
Caption images and answer visual questions from an input image and text query, returning a text response. Handle general...
Answer questions about images. Accepts an image and an optional text prompt and returns a text response for visual quest...
Answer questions, write code, and analyze images with a fast, cost-efficient reasoning model. Accept a single prompt or...
Caption images and answer visual questions from an input image and text prompt. Accept an image plus a question or instr...
Caption images and answer visual questions from an input image, returning text. Accepts an image and a natural-language...
Generate and analyze text with optional image input. Accept a text prompt and an optional image and return text, support...
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Caption images and answer visual questions from an image plus an optional text prompt, returning text. Handle OCR-style...
Generate text for chat, Q&A, coding, and document workflows with fast, low-latency responses. Accept text prompts and op...
Chat and generate text with low latency and cost, with optional image inputs for visual reasoning and captioning. Accept...
Generate and reason over text with optional image inputs, returning text outputs. Handle long-context tasks with a 200k-...