peter65374/cog-resnet
Classify images by identifying objects and assigning confidence scores to each detected object.
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Generate and structure text from prompts or multi-turn chat messages, with optional image inputs for basic visual unders...
Answer questions about images and generate captions from an image and a text query, returning text. Accept a single imag...
Detect the likelihood of deepfake faceswaps in images. This model focuses on identifying faceswaps with high confidence,...
Generate and analyze text with optional image inputs, returning text for tasks like captioning and visual question answe...
Answer questions about images. Takes an image and a text prompt and returns a text response, enabling visual question an...
Compute an integer from an input image. Accepts an image and outputs a numeric value, useful for testing image input pip...
Generate fine-grained captions for images using a CLIP-based reward system. This model evaluates image captions based on...
Answer questions about images, documents, charts, and tables. Takes an image and a text prompt and returns text. Support...
Answer questions about images and text with multimodal reasoning. Takes a text prompt with an optional image and outputs...
Generate and reason over text from prompts and optional images. Accept text or chat-style messages and image inputs, and...
Generate text and analyze images from prompts or multi-turn messages, returning text outputs. Accept multiple image inpu...
Caption images and answer visual questions from an input image and text prompt. Accept an image plus a question or instr...
Answer questions and caption images from an input image and text prompt. Accept an image plus a natural-language query a...
Answer questions about images, caption scenes, and localize entities with bounding boxes. Accept a ChatML-formatted prom...
Generate text and analyze images for chat, coding, and reasoning. Accept text prompts or chat messages with optional ima...
Generate and analyze text with optional image input. Accept a text prompt and an optional image and return text, support...
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Caption images and answer visual questions from an image plus an optional text prompt, returning text. Handle OCR-style...