zsxkib/molmo-7b 📝🖼️🔢 → 📝

▶️ 552.2K runs 📅 Sep 2024 ⚙️ Cog 0.9.24
image-analysis image-captioning visual-understanding

About

allenai/Molmo-7B-D-0924: answers questions about images and generates captions for them.

Example Output

Output

I see a charming scene featuring a large, fluffy white dog sitting on a wooden bench in the middle of a field. The dog appears to be a poodle mix, with curly fur covering its entire body. It's sitting upright with its front paws hanging over the edge of the bench, looking directly at the camera with a happy expression. The dog's tongue is hanging out, and its eyes are dark and expressive.

The bench is made of dark brown wood with a lattice design on the backrest. It's positioned on a patch of brown, dead grass, which suggests it might be late fall or early winter. In the background, there's a vast field filled with tall, brown grass, and further back, I can see a line of bare trees against a light blue sky.

The overall atmosphere is peaceful and serene. The dog seems to be enjoying its time outdoors, perhaps waiting for its owner or just taking in the scenery. The contrast between the white, fluffy dog and the brown

Performance Metrics

11.97s Prediction Time
14.54s Total Time
All Input Parameters
{
  "text": "What do you see? Give me a detailed answer",
  "image": "https://replicate.delivery/pbxt/LRy82RONNFuqeS0JjwoxJQVxJMkxQ73xdshWr9mhXmRPJWjy/dogonbench.png",
  "top_k": 50,
  "top_p": 1,
  "temperature": 1,
  "length_penalty": 1,
  "max_new_tokens": 200
}
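
As a rough illustration of how the inputs above could be submitted programmatically, the sketch below builds an HTTP request that creates a prediction for this version. It assumes Replicate's public predictions endpoint (`https://api.replicate.com/v1/predictions`) with bearer-token authentication; the helper names `build_prediction_request` and `create_prediction` are hypothetical and not part of any library. The version ID and input values are copied from this page.

```python
import json
import urllib.request

# Assumption: Replicate's public HTTP endpoint for creating predictions.
API_URL = "https://api.replicate.com/v1/predictions"

# Version ID as listed under "Version Details" on this page.
VERSION = "76ebd700864218a4ca97ac1ccff068be7222272859f9ea2ae1dd4ac073fa8de8"

# Input values copied from the "All Input Parameters" example.
EXAMPLE_INPUT = {
    "text": "What do you see? Give me a detailed answer",
    "image": "https://replicate.delivery/pbxt/LRy82RONNFuqeS0JjwoxJQVxJMkxQ73xdshWr9mhXmRPJWjy/dogonbench.png",
    "top_k": 50,
    "top_p": 1,
    "temperature": 1,
    "length_penalty": 1,
    "max_new_tokens": 200,
}


def build_prediction_request(token, inputs):
    """Build (but do not send) a POST request that creates a prediction."""
    body = json.dumps({"version": VERSION, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def create_prediction(token, inputs):
    """Send the request and return the parsed JSON response.

    The response describes the prediction; its output may not be ready
    immediately, so a caller would typically need to poll for completion.
    """
    request = build_prediction_request(token, inputs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires a valid API token and network access):
#   prediction = create_prediction("<your token>", EXAMPLE_INPUT)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.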
Input Parameters
text (required) Type: string
Text prompt or question about the image
image (required) Type: string
Input image
top_k Type: integer, Default: 50, Range: 1 - 100
Number of highest-probability vocabulary tokens to keep for top-k filtering
top_p Type: number, Default: 1, Range: 0 - 1
Cumulative probability threshold for top-p filtering
temperature Type: number, Default: 1, Range: 0.1 - 2
Randomness in token selection (higher values increase randomness)
length_penalty Type: number, Default: 1, Range: 0.1 - 2
Exponential penalty applied to the output length (values < 1.0 encourage shorter outputs, values > 1.0 encourage longer outputs)
max_new_tokens Type: integer, Default: 200, Range: 1 - 1000
Maximum number of new tokens to generate
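
The defaults and ranges listed above can be checked on the client side before a request is sent. The following is a minimal sketch under stated assumptions: the ranges are treated as inclusive (the page does not say), and `PARAMETER_SPECS` and `build_input` are hypothetical names introduced only for illustration.

```python
# Defaults and ranges copied from the parameter list on this page.
# Each entry: (expected kind, default, minimum, maximum).
PARAMETER_SPECS = {
    "top_k": (int, 50, 1, 100),
    "top_p": (float, 1, 0, 1),
    "temperature": (float, 1, 0.1, 2),
    "length_penalty": (float, 1, 0.1, 2),
    "max_new_tokens": (int, 200, 1, 1000),
}


def build_input(text, image, **overrides):
    """Return an input dict with defaults filled in and ranges checked.

    Assumption: range bounds are inclusive.
    """
    if not text or not image:
        raise ValueError("both 'text' and 'image' are required")

    unknown = set(overrides) - set(PARAMETER_SPECS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"text": text, "image": image}
    for name, (kind, default, low, high) in PARAMETER_SPECS.items():
        value = overrides.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            raise TypeError(f"{name} must be a number, not a bool")
        if kind is int and not isinstance(value, int):
            raise TypeError(f"{name} must be an integer")
        if kind is float and not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} - {high}")
        payload[name] = value
    return payload
```

For example, `build_input("Describe the image", "https://example.com/photo.png", temperature=0.7)` would return a dict containing the two required fields, the overridden temperature, and the documented defaults for everything else. The image URL here is a placeholder.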
Output Schema

Output

Type: string

Version Details
Version ID
76ebd700864218a4ca97ac1ccff068be7222272859f9ea2ae1dd4ac073fa8de8
Version Created
September 26, 2024