anthropic/claude-4-sonnet
Generate and reason over text and code from a prompt, with optional image input for captioning and visual analysis, and...
Found 86 models (showing 21-40)
Generate and reason over text and code from a prompt, with optional image input for captioning and visual analysis, and...
Generate text from prompts or chat messages, with optional image analysis for multimodal reasoning. Handle instruction f...
Generate text from prompts or chat and analyze images to produce captions and grounded answers. Accepts text and optiona...
Solve complex reasoning tasks and generate text responses from prompts, multi-turn chat messages, and images. Accept a s...
Generate and analyze text and code from a prompt, with optional image input for visual understanding and data extraction...
Answer questions about images and generate image-grounded text. Takes an image and a text prompt and returns text, enabl...
Generate captions for images using a simple GPT-5-mini wrapper. Input an image and receive a descriptive text output tha...
Answer questions about images and generate captions. Accepts an image and an optional text prompt/question, and returns...
Generate text and code from a prompt, with optional image analysis for captions and visual reasoning. Accepts a text pro...
Answer visual questions and caption images from a text prompt and an optional image, returning text. Support multi-turn...
Caption images and answer natural-language questions about an input image. Provide an image and an optional instruction...
Answer questions about images and generate captions from an image and a text query, returning text. Accept a single imag...
Answer questions about images, caption scenes, and localize entities with bounding boxes. Accept a ChatML-formatted prom...
Generate text captions from images. Accepts a single image and returns a concise natural-language description of its con...
Answer questions about images and generate captions from an image input. Takes an image and a text prompt (e.g., “Descri...
Caption images. Takes an input image and returns a short natural-language description as text, useful for alt text, acce...
Generate captions for images by combining three input images using a mathematical operation. The model outputs text desc...
Caption images. Accepts an image and outputs a short natural-language description using visual attention to focus on sal...
Answer questions about images from an image and a text prompt, returning text. Generate captions, short answers, and exp...
Generate fine-grained captions for images using a CLIP-based reward system. This model evaluates image captions based on...