aliakbarghayoori/dfn5b-clip-vit-h-14-384
Embed images and text into a shared CLIP vector space for similarity search and zero-shot classification. Accepts lists...
Found 3 models (showing 1-3)
Embed images and text into a shared CLIP vector space for similarity search and zero-shot classification. Accepts lists...
Generate joint text and image embeddings for semantic search and cross-modal retrieval. Accepts a single text string or...
Create multilingual text and image embeddings for cross-modal search, retrieval, and similarity. Accept text (up to 8192...