Multilingual AI Models - Cloudernative

ai-forever/kandinsky-2.2

Generate images from text prompts in multiple languages. Accepts a text prompt (with optional negative prompt) and retur...

📝 → 🖼️ • text-to-image • multilingual • controlnet • 10.0M runs

chenxwh/cogvlm2-video

Caption and answer questions about videos. Accepts a video and a text prompt and returns text outputs such as descriptio...

🎥 • video-to-text • video-auto-captioning • 664.1K runs

lucataco/qwen3-embedding-8b

Embed text into dense vectors for semantic search, retrieval/reranking, clustering, classification, recommendations, and...

text-embedding • multilingual • 787.1K runs

beautyyuyanli/multilingual-e5-large

Generate multilingual text embeddings for semantic search, retrieval, and clustering. Takes one or more texts as input a...

text-embedding • multilingual • 30.0M runs

lucataco/qwen1.5-72b

Generate text based on prompts using a 72-billion parameter transformer-based language model. Supports multilingual text...

📝 → 📝 • text-generation • 4.2K runs

lucataco/qwen1.5-7b

Generate text responses based on prompts using a 7B parameter transformer-based decoder-only language model. Supports te...

📝 → 📝 • text-generation • 3.5K runs

minimax/speech-02-turbo

Generate speech audio from text with low latency for real-time applications. Choose from 300+ prebuilt voices or supply...

📝 → 🔊 • text-to-speech • multilingual-tts • real-time • 5.1M runs

wan-video/wan-2.5-i2v

Generate videos with synchronized audio from input images and text prompts. Uses Alibaba's WAN 2.5 model to create video...

🖼️ → 🎥 • image-to-video-with-audio • lipsync • 213.0K runs

ibm-granite/granite-embedding-278m-multilingual

Generate multilingual text embeddings for semantic search, similarity, and retrieval. Accepts a list of texts and return...

text-embedding • multilingual • 1.2K runs

zsxkib/jina-clip-v2

Converts text and images into vector embeddings for similarity search and multimodal analysis. Supports 89 languages for...

🖼️ • text-embedding • image-embedding • 943.6K runs

lucataco/qwen1.5-1.8b

Generate and chat in multiple languages from text prompts. Accepts a prompt and optional system prompt and returns text....

📝 → 📝 • text-generation • multilingual • 741 runs

openai/whisper

Transcribe speech from audio into text. Perform multilingual automatic speech recognition with language detection and op...

speech-to-text • speech-translation • 137.1M runs

lucataco/higgs-audio-v2

Generate expressive, multilingual speech audio from text input. Produce zero-shot multi-speaker dialogues, emotional del...

📝 → 🔊 • text-to-speech • multilingual • 1.4K runs

lucataco/xtts-v2

Clone a voice from a short reference clip and synthesize multilingual speech from text. Provide a text prompt and at lea...

📝 → 🔊 • text-to-speech • voice-cloning • 4.4M runs

ibm-granite/granite-3.3-8b-instruct

Generate text responses from conversational prompts and instructions with support for 128K context length. Supports adva...

📝 → 📝 • text-generation • text-embedding • 1.7M runs

jaaari/kokoro-82m

Generate speech audio from text with selectable multilingual voices. Accepts text input, a preset voice, and a speed mul...

📝 → 🔊 • text-to-speech • multilingual-tts • 53.2M runs

awerks/neon-tts

Generate speech audio from text across 25 languages. Accepts text and a language selection; returns spoken audio. Suppor...

📝 → 🔊 • text-to-speech • multilingual • 161.0K runs

jichengdu/cosyvoice

Clone a speaker's voice and synthesize speech from text, including cross-lingual and mixed-lingual output. Accepts refer...

📝 → 🔊 • text-to-speech • voice-cloning • multilingual • 1.7K runs

alphanumericuser/kokoro-82m

Generate speech audio from text with multilingual preset voices. Accepts text plus optional language selection, voice pr...

📝 → 🔊 • text-to-speech • multilingual • 4.0M runs

nvidia/sana

Generates high-resolution images up to 4096x4096 from text prompts with fast generation speed. Uses linear diffusion tra...

📝 → 🖼️ • text-to-image • 257.4K runs