
spuuntries/urna-kp3l
Answer questions about an image and generate captions from a text prompt and image input. Accepts an image and a natural...
Found 122 models (showing 1-20)
Answer questions about an image and generate captions from a text prompt and image input. Accepts an image and a natural...
Generate and edit images from text and answer questions about images. Accepts a text prompt for text-to-image generation...
Segment objects and regions in images from natural-language instructions and answer visual questions. Takes an image and...
Segment objects and answer questions from an input image using natural-language instructions. Provide an image and an in...
Segment objects in images from a text instruction, returning a segmentation overlay image and a textual response. Suppor...
Caption images and long videos and answer visual questions, returning text. Accepts an image or video plus an instructio...
Analyze images in conversational chat to answer questions, caption scenes, and localize objects with bounding boxes. Acc...
Extract and reason over content in document images. Accepts an image and a text prompt or question, and returns text ans...
Caption a single input image. Answer visual questions about the image and evaluate image-text matching by ch...
Answer questions about images and generate captions. Takes an image and a natural-language question as input and returns...
Answer questions about an input image and generate captions, returning text. Accept an image plus a question, or enable...
Answer questions about images and extract text from images. Takes an image and a text prompt and returns text, enabling...
Generate descriptive tags from an input image. Accepts a single image and returns a comma-separated list of booru-style...
Answer questions about images from an image input and a text prompt, returning text. Support visual question answering (...
Generate text from prompts or chat messages with ultra-low latency and a 1 million token context window. Handle high-thr...
Generate text quickly for chat, question answering, code generation, classification, summarization, and translation. Acc...
Answer questions about images. Accepts an image and a text prompt and returns text, enabling visual question answering,...
Caption images and answer visual questions from an input image and text prompt. Accepts an image and a natural-language...
Answer questions about images and produce text output. Handle image captioning, visual question answering (VQA), optical...
Answer questions about images and videos. Accepts an image or a video plus a question and returns text, enabling visual...