zsxkib/jina-clip-v2 📝🖼️🔢❓ → ❓

▶️ 654.1K runs 📅 Nov 2024 ⚙️ Cog 0.13.2 📄 Paper ⚖️ License
image-embedding multilingual text-embedding

About

Jina-CLIP v2 is a 0.9B-parameter multimodal embedding model that supports 89 languages, accepts images at 512x512 resolution, and produces Matryoshka representations whose output dimension can be chosen at request time.

Example Output

Output

uAhEvoVqwDyFtDg+aP6rvHA6xD1iKvY9dBz/vBJCXz4xFSY92mw/vl3zSz5xY0e8ugd0vmo/Ij4X/N09yuLcOxs+nb0fJcU8CgtDPqJKKr3mXLO9Zm6DvTobtT1Evk++mzNqvLbZJT6YEDw+ngaOvXOaab2gofQ8qrAGvkkrgD67Dws+6nKbvnk34L2bAvI9FlNKOsW6W73YY5K+bVmkvWj2Lj3TyUK7EYxAPj4/4Ly0mY294LcePeQggTzc89e9gMUNvpyvujtA/hi9O40APAwixL25WwM+b7jzO9cHvj0gboc9bxyUvTXXXL0VoQQ9cwQnPUpHHj5pH7i9eRNgvg==
eR/JvaEGCj5U8Lw8oGMEPhHuD74frls8R6XovaMmTLwPaIC9fXW6vF5ahDxqi929Zmtbvk7tJT5t7CM8w89zPWeogz3HHIw9jd8dPvUD3b0wQki9Kpt2vWBYmb1Xxam9kz7yPXAA6b23roS9qQiovXZZ9LxOgw8+7xlBvRMOPD4fskc9zONKvjLOYb4/ed49KAKxO1Bwib35RqQ8G3EKPu/TCj4xFdE9O4g8Plzrib2NUNO9rUwkPvxIQj2QbXW+PUDrPYbdlb2JUke+lRMOPS3OJb5XPAO+05oBPRSPAj0DBSC9aYQmPi4Fib7/pZS+oj0kvUriED79nTm+vE5Vvg==
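
The base64 output above is not directly usable as a vector, so a decoding step is needed before any similarity computation. The following is a minimal sketch, assuming each base64 string holds packed little-endian float32 values (an assumption about the encoding, not something this page states). The helper name `decode_embedding` is hypothetical.

```python
import base64
import struct


def decode_embedding(b64_string: str) -> list[float]:
    """Decode one base64 string into a list of floats.

    Assumption: the payload is a sequence of little-endian float32 values.
    """
    raw = base64.b64decode(b64_string)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" selects little-endian byte order, "f" is a 4-byte float.
    return list(struct.unpack(f"<{count}f", raw))
```

Under this assumption, a 64-dimensional embedding occupies 256 bytes, which base64 encodes as 344 characters ending in `==`.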

Performance Metrics

Prediction Time: 1.13s
Total Time: 121.01s

All Input Parameters
{
  "text": "A cute fox",
  "image": "https://images.stockcake.com/public/f/b/a/fba4da70-95d2-4e86-825b-38c32b15f678_large/fox-in-flight-stockcake.jpg",
  "embedding_dim": 64,
  "output_format": "base64"
}
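
A request with these parameters could be sent from Python roughly as follows. This is a sketch, assuming the official `replicate` client package is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `embed` are hypothetical, and the check that at least one of `text` or `image` is supplied is an assumption rather than documented behaviour.

```python
MODEL_VERSION = (
    "zsxkib/jina-clip-v2:"
    "5050c3108bab23981802011a3c76ee327cc0dbfdd31a2f4ef1ee8ef0d3f0b448"
)


def build_input(text=None, image=None, embedding_dim=64, output_format="base64"):
    """Assemble the input dict and check the documented embedding_dim range."""
    if text is None and image is None:
        # Assumption: the model needs at least one of the two inputs.
        raise ValueError("provide text, image, or both")
    if not 64 <= embedding_dim <= 1024:
        raise ValueError("embedding_dim must be between 64 and 1024")
    payload = {"embedding_dim": embedding_dim, "output_format": output_format}
    if text is not None:
        payload["text"] = text
    if image is not None:
        payload["image"] = image
    return payload


def embed(**kwargs):
    """Run the model remotely and return its output list."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(MODEL_VERSION, input=build_input(**kwargs))
```

Calling `embed(text="A cute fox", image="<image URL>")` would then submit both inputs with the default `embedding_dim` of 64.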

Input Parameters

text Type: string
Text content to embed (up to 8192 tokens). If both text and image are provided, the text embedding comes first in the returned list.
image Type: string
Image file to embed (optimal size: 512x512). If both text and image are provided, the image embedding comes second in the returned list.
embedding_dim Type: integer, Default: 64, Range: 64 - 1024
Matryoshka dimension: the length of each output embedding (64 to 1024).
output_format Default: base64
Format used for the returned embeddings.
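
To illustrate what a Matryoshka dimension means in general: embeddings trained this way are meant to stay useful when only their leading components are kept. The sketch below shows that idea on the client side, keeping the first `dim` components and rescaling to unit length. Whether the hosted model computes its `embedding_dim` output in exactly this way is an assumption, and `truncate_embedding` is a hypothetical helper.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 1 <= dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalised
    return [x / norm for x in head]
```

Smaller dimensions reduce storage and comparison cost, usually at some loss of retrieval quality.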
Output Schema

Output

Type: array
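
Since the output is an array whose order depends on which inputs were supplied, it can help to unpack it explicitly before comparing embeddings. This is a minimal sketch: `split_output` and `cosine_similarity` are hypothetical helpers, and the case where only one input is given is assumed to yield a one-element array.

```python
import math


def split_output(output, has_text, has_image):
    """Map the returned list to a (text_embedding, image_embedding) pair."""
    expected = int(has_text) + int(has_image)
    if len(output) != expected:
        raise ValueError(f"expected {expected} embeddings, got {len(output)}")
    items = iter(output)
    # Text comes first when both inputs are provided.
    text_emb = next(items) if has_text else None
    image_emb = next(items) if has_image else None
    return text_emb, image_emb


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

With base64 output, each element has to be decoded into numbers before it is passed to `cosine_similarity`.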

Version Details

Version ID: 5050c3108bab23981802011a3c76ee327cc0dbfdd31a2f4ef1ee8ef0d3f0b448
Version Created: November 28, 2024