peter65374/cog-resnet
Classify images by identifying objects and assigning confidence scores to each detected object.
Found 76 models (showing 21-40)
Classify images by identifying objects and assigning confidence scores to each detected object.
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Generate and classify text from prompts or chat messages with ultra-low latency and up to a 1M-token context window. Acc...
Answer questions about images and generate captions from an image and a text query, returning text. Accept a single imag...
Detect the likelihood of deepfake faceswaps in images. This model focuses on identifying faceswaps with high confidence,...
Generate text and analyze images from a text prompt (optionally with an image), returning text for conversation, caption...
Answer questions about images. Takes an image and a text prompt and returns a text response, enabling visual question an...
Compute an integer from an input image. Accepts an image and outputs a numeric value, useful for testing image input pip...
Generate fine-grained captions for images using a CLIP-based reward system. This model evaluates image captions based on...
Answer questions about images, documents, charts, and tables. Takes an image and a text prompt and returns text. Support...
Answer questions about images and text with multimodal reasoning. Takes a text prompt with an optional image and outputs...
Chat with a multimodal large language model using text and optional images as input and receive streamed text outputs. G...
Answer questions, write code, and analyze images with a fast, cost-efficient reasoning model. Accept a single prompt or...
Caption images and answer visual questions from an input image and text prompt. Accept an image plus a question or instr...
Answer questions and caption images from an input image and text prompt. Accept an image plus a natural-language query a...
Answer questions about images, caption scenes, and localize entities with bounding boxes. Accept a ChatML-formatted prom...
Chat and generate text with low latency and cost, with optional image inputs for visual reasoning and captioning. Accept...
Generate and analyze text with optional image input. Accept a text prompt and an optional image and return text, support...
Answer questions about images and documents from an image and a text prompt, returning text. Handle visual question answ...
Caption images and answer visual questions from an image plus an optional text prompt, returning text. Handle OCR-style...