nelsonjchen/minigpt-4_vicuna-13b
Answer questions about images and generate captions. Accepts an image and an optional text prompt/question, and returns...
Found 76 models (showing 61-76)
Analyze images and answer questions from an image plus text prompt, returning text. Handle visual question answering (VQ...
Answer questions about images in Korean. Take an image and a Korean prompt and generate Korean text for visual question...
Analyze images or video and generate text captions, answers, and summaries. Accepts single or multiple images or a video...
Caption images and videos and answer visual questions. Accepts an optional image or video plus a text prompt and returns...
Answer questions about images and generate captions from an image and a text prompt, outputting text. Perform visual que...
Answer questions about images from a text prompt and return a text response. Accept a single image plus a natural-langua...
Visualize which regions of an image CLIP associates with a given text prompt. Generates a saliency heatmap, optionally o...
Generate and reason over text from prompts, with optional image, audio, and video inputs. Produce answers, explanations,...
Generate text, captions, and summaries from text, image, and video inputs. Support question answering, code generation a...
Answer questions and caption images from a text prompt and an optional image, returning text. Generate long-form text an...
Estimate 2D poses of multiple people in an image using a lightweight version of OpenPose. Outputs include 18 keypoints p...
Analyze images and generate detailed textual descriptions based on visual content. Supports input via image URLs or base...
Generate text responses from text, image, and audio inputs. Perform image captioning and visual question answering, OCR,...
Analyze images and text to generate answers, working code, and polished documents. Takes a text prompt with an optional...
Analyze medical images and medical text to generate non-clinical explanations and answers. Accepts a text prompt and opt...