zsxkib/idefics3 📝🖼️🔢❓ → 📝

▶️ 2.7K runs 📅 Aug 2024 ⚙️ Cog 0.9.14 📄 Paper ⚖️ License

image-analysis image-captioning image-to-text visual-question-answering visual-understanding

Performance

3.4s Typical run time

~161s Cold start (first call)

2.7K Total runs

About

Idefics3-8B-Llama3: answers questions about images and generates captions for them.

Example Output

Output

A white dog is sitting on the bench. The background of the image is blurred, but we can still see trees and dry grass in the background. There are clouds visible in the sky.

Performance Metrics

3.45s Prediction Time

161.23s Total Time

All Input Parameters

{
  "text": "What do you see? Give me a detailed answer",
  "image": "https://replicate.delivery/pbxt/LRy82RONNFuqeS0JjwoxJQVxJMkxQ73xdshWr9mhXmRPJWjy/dogonbench.png",
  "top_p": 0.8,
  "temperature": 0.4,
  "max_new_tokens": 512,
  "assistant_prefix": "Let's think step by step.",
  "decoding_strategy": "top-p-sampling",
  "repetition_penalty": 1.2
}
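
The parameters above can be sent to the model from Python. The following is a minimal sketch, not the author's own client code: it assumes the `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `describe_image` are hypothetical; the model name and version ID are the ones listed on this page.

```python
# Model reference built from the page title and the Version ID on this page.
MODEL_REF = (
    "zsxkib/idefics3:"
    "b06f5f6b6249b27d0b00d1b794240e5641190d1582ad68c40ef53778459bb593"
)


def build_input(text, image_url, **overrides):
    """Return the example payload shown above, with optional overrides.

    Hypothetical helper: it only assembles a dict and makes no network call.
    """
    payload = {
        "text": text,
        "image": image_url,
        "top_p": 0.8,
        "temperature": 0.4,
        "max_new_tokens": 512,
        "assistant_prefix": "Let's think step by step.",
        "decoding_strategy": "top-p-sampling",
        "repetition_penalty": 1.2,
    }
    payload.update(overrides)
    return payload


def describe_image(text, image_url, **overrides):
    """Run the model once and return its answer as a single string."""
    # Imported lazily so build_input works without the client installed.
    import replicate

    output = replicate.run(MODEL_REF, input=build_input(text, image_url, **overrides))
    # The output schema is a string; as a precaution (an assumption, not
    # documented on this page) an iterable of text chunks is joined.
    return output if isinstance(output, str) else "".join(output)
```

Calling `describe_image("What do you see? Give me a detailed answer", "<image URL>")` should return a caption like the example output above. The first call may take much longer than later ones because of the cold start.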

Input Parameters

text (required): Type: string. Text query
image (required): Type: string. Upload your image
top_p: Type: number, Default: 0.8, Range: 0.01 - 0.99. Top P for sampling
temperature: Type: number, Default: 0.4, Range: 0 - 5. Temperature for sampling
max_new_tokens: Type: integer, Default: 512, Range: 8 - 1024. Maximum number of new tokens
assistant_prefix: Type: string, Default: Let's think step by step. Assistant prefix
decoding_strategy: Default: greedy. Decoding strategy
repetition_penalty: Type: number, Default: 1.2, Range: 0.01 - 5. Repetition penalty
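
The documented ranges can be checked locally before a request is sent. This is a rough sketch under two assumptions: the range bounds are inclusive, which this page does not state, and the function name `validate_input` is hypothetical rather than part of any Replicate API.

```python
# Numeric ranges copied from the parameter list above.
# Assumption: both bounds are inclusive.
RANGES = {
    "top_p": (0.01, 0.99),
    "temperature": (0.0, 5.0),
    "max_new_tokens": (8, 1024),
    "repetition_penalty": (0.01, 5.0),
}


def validate_input(payload):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    # text and image are the only required fields.
    for key in ("text", "image"):
        if not payload.get(key):
            errors.append(f"{key} is required")
    # Optional numeric fields are checked only when present.
    for key, (low, high) in RANGES.items():
        if key in payload and not (low <= payload[key] <= high):
            errors.append(f"{key}={payload[key]} is outside {low} - {high}")
    # max_new_tokens is documented as an integer.
    if "max_new_tokens" in payload and not isinstance(payload["max_new_tokens"], int):
        errors.append("max_new_tokens must be an integer")
    return errors
```

For example, `validate_input({"text": "What is this?", "image": "<image URL>", "top_p": 1.0})` reports one problem, because 1.0 exceeds the documented upper bound of 0.99.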

Output Schema

Output

Type: string

Version Details

Version ID: b06f5f6b6249b27d0b00d1b794240e5641190d1582ad68c40ef53778459bb593
Version Created: August 15, 2024
