
spuuntries/urna-kp3l
Answer questions about an image and generate captions from a text prompt and image input. Accepts an image and a natural...
Found 118 models (showing 1-20)
Generate images from text, edit images with natural-language instructions, and answer questions about images in one unif...
Segment objects and regions in images from natural-language instructions and answer visual questions. Takes an image and...
Segment objects and answer questions from an input image using natural-language instructions. Provide an image and an in...
Segment objects in images from a text instruction, returning a segmentation overlay image and a textual response. Suppor...
Caption images and long videos and answer visual questions, returning text. Accepts an image or video plus an instructio...
Analyze images in conversational chat to answer questions, caption scenes, and localize objects with bounding boxes. Acc...
Extract text and answer questions about document images. Accepts an image and a text prompt and outputs text for OCR, ta...
Caption images from a single input image. Answer visual questions about the image and evaluate image-text matching by ch...
Answer questions about images and generate captions. Takes an image and a natural-language question as input and returns...
Answer questions about an input image and generate captions, returning text. Accept an image plus a question, or enable...
Answer questions about images, perform OCR, and caption visual content. Takes an image and a text prompt and outputs tex...
Generate descriptive tags from an input image. Accepts a single image and returns a comma-separated list of booru-style...
Answer questions about images from an image input and a text prompt, returning text. Support visual question answering (...
Generate text from prompts or chat messages with ultra-low latency and a 1 million token context window. Handle high-thr...
Generate text quickly for chat, question answering, code generation, classification, summarization, and translation. Acc...
Answer questions about images. Accepts an image and a text prompt and returns text, enabling visual question answering,...
Caption images and answer visual questions from an input image and text prompt. Accepts an image and a natural-language...
Answer questions about images and produce text output. Handle image captioning, visual question answering (VQA), optical...
Answer questions about images and videos. Accepts an image or a video plus a question and returns text, enabling visual...