
lucataco/nemotron-nano-vl-8b-v1
Answer questions about images and generate captions and summaries. Accepts one image and a natural-language question and...
Found 126 models (showing 101-120)
Generate text and code from prompts, with optional image analysis and visual question answering. Accepts a text prompt a...
Generate SDXL-ready text prompts from an input image. Return detailed captions with styles, mediums, artists, and keywor...
Extract text prompts from an input image for Stable Diffusion XL (SDXL). Returns a CLIP-Interrogator-style prompt with a...
Classify images into 1,000 ImageNet categories with ResNet-50, returning top labels with probabilities. Accepts an image...
Answer questions about images. Accepts an image and an optional text prompt and returns a text response for visual quest...
Caption images and answer visual questions from an input image, returning text. Accepts an image and a natural-language...
Caption images, detect objects, and extract text from an input image, returning text outputs. Accepts an image plus a ta...
Answer questions about images. Provide an image and a natural-language question to receive a text response that handles...
Analyze images and generate text responses to prompts. Accepts an image and a text prompt, and outputs text for visual q...
Analyze images to generate captions, detect objects with bounding boxes, and extract text (OCR). Accepts an image plus a...
Caption images and answer visual questions from an input image, returning text. Accept an image plus an instruction prom...
Generate text and analyze images from prompts for chat, Q&A, coding help, and real-time assistants. Accept text (and opt...
Answer questions about images and caption them from an image plus text prompt, returning text. Perform visual recognitio...
Extract dominant hex color codes from an image and answer questions about its contents. Accepts an image and an optional...
Automate GUI actions from a screenshot and a natural-language command. Takes a GUI screenshot image and a text instructi...
Extract on-screen text with pixel coordinates from images and screenshots. Takes an image as input and returns readable...
Classify plant diseases from leaf images. Takes a plant photo as input and returns a predicted crop-disease class label...
Generate text from text, image, and audio inputs. Handle transcription, summarization, and visual description/QA, includ...
Answer questions about images (visual question answering) from an image and a text prompt, returning text. Use a localit...