lucataco/llama-3-vision-alpha 🖼️📝 → 📝
About
Projection module trained to add vision capabilties to Llama 3 using SigLIP

Example Output
Prompt:
"Describe the image"
Output
The image is of a young girl with short, curly hair and bright blue eyes. She has a sweet and cheerful face, with a hint of mischief in her eyes. She is holding a giant hamburger in her hand, with a big bite taken out of it. She looks like she's enjoying every bite of her meal, with a satisfied expression on her face. She is sitting at a table in a medieval - style tavern, surrounded by wooden tables and chairs, with a fireplace crackling in the background. The atmosphere is cozy and inviting, with a warm glow emanating from the fireplace. The girl looks like she's in her element, enjoying a hearty meal with friends and family.
Performance Metrics
7.22s
Prediction Time
213.54s
Total Time
All Input Parameters
{ "image": "https://replicate.delivery/pbxt/Kq17Ws2RLIXdeFeep2N56psrMVq57TPssPrffeF8HawmOhvD/frieren.jpg", "prompt": "Describe the image" }
Input Parameters
- image (required)
- Input image
- prompt
- Input prompt
Output Schema
Output
Version Details
- Version ID
79f9f3418f65fb5d180978a969d5646e33f8ca67430a2fe903c9a4be82565925
- Version Created
- November 5, 2024