image-captioning AI Models - Page 2

anthropic/claude-4-sonnet

Generate text content from prompts with advanced reasoning and coding capabilities. Claude Sonnet 4 supports both standa...

🖼️ → 📝 • text-generation • code-generation • image-to-text • 3.0M runs

openai/gpt-5-nano

Generates text responses based on prompts or conversation messages, with support for image input analysis. This is the f...

🖼️ → 📝 • text-generation • image-to-text • code-generation • 13.9M runs

openai/gpt-5-mini

Generates text responses based on prompts or multi-turn conversations, designed as a faster and more cost-effective vers...

🖼️ → 📝 • text-generation • image-to-text • code-generation • 2.3M runs

openai/o1

Generate text responses with advanced reasoning capabilities, specializing in complex problem-solving across mathematics...

🖼️ → 📝 • text-generation • code-generation • image-to-text • 18.7K runs

anthropic/claude-3.7-sonnet

Generate text responses based on prompts with support for image analysis. Features particularly strong capabilities in c...

🖼️ → 📝 • text-generation • image-to-text • code-generation • 4.1M runs

daanelson/minigpt-4

Generates text descriptions, stories, and responses based on input images and prompts. Takes an image and text prompt as...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 1.8M runs

lucataco/image-caption

Generate captions for images using a simple GPT-5-mini wrapper. Input an image and receive a descriptive text output tha...

📝 → 📝 • image-captioning • text-generation • visual-understanding • 0 runs

nelsonjchen/minigpt-4_vicuna-13b

Answer questions about images and generate detailed image captions using MiniGPT-4 with Vicuna-13B language model. Takes...

🖼️ → 📝 • image-to-text • visual-understanding • image-analysis • 52.0K runs

anthropic/claude-4.5-sonnet

Generates text responses based on prompts and can analyze images. Excels at coding tasks with state-of-the-art performan...

🖼️ → 📝 • text-generation • code-generation • code-understanding • 1.4M runs

yorickvp/llava-v1.6-mistral-7b

Multimodal language model that analyzes images and generates text responses based on visual content and text prompts. Bu...

🖼️ → 📝 • image-to-text • text-generation • visual-understanding • 5.0M runs

nelsonjchen/minigpt-4_vicuna-7b

Analyzes images and answers questions about them using MiniGPT-4 with Vicuna-7B language model. Takes an image and an op...

🖼️ → 📝 • image-to-text • image-captioning • 9.9K runs

yimi81/yi-vl-6b

Answer questions about images and generate captions from an image and a text query, returning text. Accept a single imag...

🖼️ → 📝 • image-to-text • visual-question-answering • image-captioning • 309 runs

nomagick/qwen-vl-chat

Generates text responses based on text prompts and images with ChatML prompt interface and streaming support. Accepts up...

🖼️ → 📝 • text-generation • image-to-text • image-analysis • 1.1K runs

muqtadar08/image_to_text

Converts images into text descriptions or captions.

🖼️ → 📝 • image-to-text • 43 runs

lucataco/moondream-0.5b

Answer questions about images and generate captions from an image input. Takes an image and a text prompt (e.g., “Descri...

🖼️ → 📝 • image-to-text • visual-question-answering • image-captioning • 64 runs

webnizam/image-caption

Caption images. Takes an input image and returns a short natural-language description as text, useful for alt text, acce...

🖼️ → 📝 • image-to-text • image-captioning • 2.8K runs

yoadtew/test

Generate captions for images by combining three input images using a mathematical operation. The model outputs text desc...

📝 → 📝 • image-captioning • image-combination • text-generation • 147 runs

nohamoamary/image-captioning-with-visual-attention

Generates text captions describing the content of images using an attention-based neural network trained on the Flickr8k...

🖼️ → 📝 • image-to-text • 11.3K runs

ignaciosgithub/pllava

Answer questions about images from an image and a text prompt, returning text. Generate captions, short answers, and exp...

🖼️ → 📝 • image-to-text • visual-question-answering • 298 runs

j-min/clip-caption-reward

Generate fine-grained captions for images using a CLIP-based reward system. This model evaluates image captions based on...

🖼️ • image-captioning • clip-reward • fine-grained-captioning • 296.1K runs