lucataco/paligemma-3b-pt-224 🖼️📝 → 📝
About
PaliGemma 3B, an open vision-language model (VLM) by Google, pre-trained with 224×224 input images and 128-token input/output text sequences.

Example Output
Prompt:
"caption es"
Output:
persona estacionada en una calle
Performance Metrics
- Prediction Time: 0.58s
- Total Time: 0.63s
All Input Parameters
{ "image": "https://replicate.delivery/pbxt/Kv6Dn1Mk1tZe7vfVaRuPNBJcoDBYhRGQ33OTkq70l375ULSi/car.jpg", "prompt": "caption es" }
Input Parameters
- image (required): Grayscale input image
- prompt: Input prompt
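
The two parameters above can be sent to the model from Python. The following is a minimal sketch, assuming the third-party `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input`, `model_ref`, and `caption` are hypothetical and exist only for illustration; the model name, version ID, image URL, and prompt are taken from this page.

```python
import os

# Model name and version ID as listed on this page.
MODEL = "lucataco/paligemma-3b-pt-224"
VERSION = "c519755cce71af83c3831c3b3b7fe6c1de4a4dc27eff91f9e79639e14924a078"


def build_input(image_url: str, prompt: str) -> dict:
    """Assemble the input payload from the two documented parameters.

    Hypothetical helper: `image` is marked as required on this page,
    so an empty value is rejected before any request is made.
    """
    if not image_url:
        raise ValueError("image is required")
    return {"image": image_url, "prompt": prompt}


def model_ref() -> str:
    """Return the 'owner/name:version' reference string."""
    return f"{MODEL}:{VERSION}"


def caption(image_url: str, prompt: str = "caption es"):
    """Run one prediction (hypothetical wrapper).

    The import is deferred so the helpers above work without the
    `replicate` package. The return type depends on the model's
    output schema, which this page does not spell out.
    """
    import replicate  # third-party client; assumed to be installed

    return replicate.run(model_ref(), input=build_input(image_url, prompt))


if __name__ == "__main__":
    # Only call the API when a token is configured.
    if os.environ.get("REPLICATE_API_TOKEN"):
        url = (
            "https://replicate.delivery/pbxt/"
            "Kv6Dn1Mk1tZe7vfVaRuPNBJcoDBYhRGQ33OTkq70l375ULSi/car.jpg"
        )
        print(caption(url))
```

Pinning the version ID in the reference string keeps results tied to the exact build described on this page rather than whichever version is newest.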
Output Schema
Output
Version Details
- Version ID: c519755cce71af83c3831c3b3b7fe6c1de4a4dc27eff91f9e79639e14924a078
- Version Created: May 14, 2024