ai-forever/kandinsky-2.2
Generate images from text prompts in multiple languages. Accepts a text prompt (with optional negative prompt) and retur...
Found 85 models (showing 1-20)
Caption and answer questions about videos. Accepts a video and a text prompt and returns text outputs such as descriptio...
Embed text into dense vectors for semantic search, retrieval/reranking, clustering, classification, recommendations, and...
Generate multilingual text embeddings for semantic search, retrieval, and clustering. Takes one or more texts as input a...
Generate multilingual text and code from a text prompt. Handle chat-style interaction, question answering, document summ...
Generate multilingual text for chat, question answering, code generation and explanation, translation, summarization, an...
Generate speech audio from text with low latency for real-time applications. Choose from 300+ prebuilt voices or supply...
Generate videos with audio from an input image and a text prompt. Optionally upload an audio track (3–30 s) for voice/mu...
Generate multilingual text embeddings for semantic search, similarity, and retrieval. Accepts a list of texts and return...
Create multilingual text and image embeddings for cross-modal retrieval and semantic search. Accepts text (up to 8192 to...
Generate and chat in multiple languages from text prompts. Accepts a prompt and optional system prompt and returns text....
Transcribe speech from audio into text. Perform multilingual automatic speech recognition with language detection and op...
Generate expressive, multilingual speech audio from text input. Produce zero-shot multi-speaker dialogues, emotional del...
Clone a voice from a short reference clip and synthesize multilingual speech from text. Provide a text prompt and at lea...
Generate text for chat, question answering, summarization, translation, and code from prompts or multi-turn messages. Le...
Generate speech audio from text with selectable multilingual voices. Accepts text input, a preset voice, and a speed mul...
Generate speech audio from text across 25 languages. Accepts text and a language selection; returns spoken audio. Suppor...
Clone a speaker's voice and synthesize speech from text, including cross-lingual and mixed-lingual output. Accepts refer...
Generate speech audio from text with multilingual preset voices. Accepts text plus optional language selection, voice pr...
Generate images from text prompts. Produce fast 512px–4096×4096 outputs with strong text–image alignment. Accept English...