lucataco/idefics-8b 🖼️📝🔢 → 📝
About
Idefics2 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces text outputs.

Example Output
Prompt:
"Where is this pastry from?"
Output:
Turkey.
Performance Metrics
- Prediction Time: 3.99s
- Total Time: 158.63s
All Input Parameters
{ "image": "https://replicate.delivery/pbxt/KnG23ICcKFDi6YLBeGt9N3pncNTShrG6oxiekeG7KwlgQugr/baklava.png", "prompt": "Where is this pastry from?", "max_new_tokens": 512, "repetition_penalty": 1.2 }
Input Parameters
- image (required): Grayscale input image
- prompt: Input prompt
- max_new_tokens: Maximum number of tokens to generate
- repetition_penalty: Repetition penalty
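The parameters above can be assembled into a request with the Replicate Python client. The following is a minimal sketch, not the model author's code: the helper names `build_input` and `run_prediction` are hypothetical, the default values simply mirror the example request on this page (they are not confirmed as the model's own defaults), and the model reference is put together from the page title and the Version ID listed under Version Details. It assumes `pip install replicate` and a `REPLICATE_API_TOKEN` environment variable for the actual call.

```python
# Model reference: owner/name from the page title plus the listed Version ID.
MODEL_REF = (
    "lucataco/idefics-8b:"
    "7ab312514f213130c4a2db68b93a1719f5cc7c3246c408ba91d507b212a24303"
)


def build_input(image, prompt, max_new_tokens=512, repetition_penalty=1.2):
    """Assemble the input dictionary (hypothetical helper).

    `image` is the only field marked as required on the page. The numeric
    defaults are assumptions copied from the example request.
    """
    if not image:
        raise ValueError("image is required")
    return {
        "image": image,
        "prompt": prompt,
        "max_new_tokens": int(max_new_tokens),
        "repetition_penalty": float(repetition_penalty),
    }


def run_prediction(payload):
    """Send the payload to Replicate (hypothetical helper).

    The client is imported lazily so that build_input can be used and
    tested without the `replicate` package or network access.
    """
    import replicate

    return replicate.run(MODEL_REF, input=payload)


if __name__ == "__main__":
    payload = build_input(
        image="https://replicate.delivery/pbxt/KnG23ICcKFDi6YLBeGt9N3pncNTShrG6oxiekeG7KwlgQugr/baklava.png",
        prompt="Where is this pastry from?",
    )
    # With REPLICATE_API_TOKEN set, call run_prediction(payload) to execute.
    print(payload)
```

Keeping payload construction separate from the network call makes it possible to validate inputs locally before spending prediction time.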
Output Schema
Output
Example Execution Logs
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.
INPUT: User:<image>Where is this pastry from?<end_of_utterance> Assistant: |OUTPUT: ['Turkey.']
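The INPUT entry in the log shows how the user prompt is wrapped before it reaches the model. A minimal sketch of that wrapping is given below; `format_prompt` is a hypothetical helper, not part of the model's code, and the separator before `Assistant:` is an assumption: the log as captured here shows a single space, but the original may have contained a newline.

```python
def format_prompt(question, separator=" "):
    """Wrap a question in the turn format seen in the execution log.

    Hypothetical helper. `separator` is the whitespace between the end of
    the user turn and the assistant tag; its true value is not certain
    from the log as captured.
    """
    return f"User:<image>{question}<end_of_utterance>{separator}Assistant:"


if __name__ == "__main__":
    print(format_prompt("Where is this pastry from?"))
```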
Version Details
- Version ID: 7ab312514f213130c4a2db68b93a1719f5cc7c3246c408ba91d507b212a24303
- Version Created: April 22, 2024