
peter65374/cog-resnet
Classify images by identifying objects and assigning confidence scores to each detected object.
Found 49 models (showing 21-40)
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Generate text from prompts or chat messages with ultra-low latency and a 1 million token context window. Handle high-thr...
Answer questions about images and generate captions from an image and a natural-language query. Takes an image plus a te...
Detect the likelihood of deepfake faceswaps in images. This model focuses on identifying faceswaps with high confidence,...
Generate text and answer questions about images from a text prompt, returning text outputs for chat, captioning, and vis...
Answer questions about images. Accept an image and a natural-language prompt and return text, enabling visual question a...
Compute an integer from an input image. Accepts an image and outputs a numeric value, useful for testing image input pip...
Generate fine-grained captions for images using a CLIP-based reward system. This model evaluates image captions based on...
Answer questions about images and extract information, returning text. Accepts an image plus a text prompt and outputs t...
Answer questions about images and text with step-by-step reasoning. Accepts a text prompt and an optional image, and out...
Generate text responses from prompts or chat messages, with optional image inputs for visual reasoning. Accepts text and...
Generate and reason over text from prompts or chat messages, with optional image inputs, returning text outputs. Solve m...
Answer questions about images and generate captions from an image plus a text prompt, returning text. Analyze photos, do...
Caption images and answer visual questions from an image and a text prompt, returning text. Add visual understanding to...
Analyze images in conversational chat to answer questions, caption scenes, and localize objects with bounding boxes. Acc...
Generate text and analyze images from prompts or chat messages, optimized for low latency and cost. Accepts text and opt...
Generate text and understand images from text and optional image inputs. Handle chat, question answering, document summa...
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Caption images and answer visual questions from an input image, returning text. Accept an image plus an instruction prom...