ai-forever/kandinsky-2.2
Generate images from text prompts in multiple languages. Accepts a text prompt (with optional negative prompt) and retur...
Found 97 models (showing 1-20)
Caption and answer questions about videos. Accepts a video and a text prompt and returns text outputs such as descriptio...
Embed text into dense vectors for semantic search, retrieval/reranking, clustering, classification, recommendations, and...
Generate multilingual text embeddings for semantic search, retrieval, and clustering. Takes one or more texts as input a...
Generate text based on prompts using a 72-billion-parameter transformer-based language model. Supports multilingual text...
Generate text responses based on prompts using a 7B parameter transformer-based decoder-only language model. Supports te...
Generate speech audio from text with low latency for real-time applications. Choose from 300+ prebuilt voices or supply...
Generate videos with synchronized audio from input images and text prompts. Uses Alibaba's WAN 2.5 model to create video...
Generate multilingual text embeddings for semantic search, similarity, and retrieval. Accepts a list of texts and return...
Convert text and images into vector embeddings for similarity search and multimodal analysis. Supports 89 languages for...
Generate and chat in multiple languages from text prompts. Accepts a prompt and optional system prompt and returns text....
Transcribe speech from audio into text. Perform multilingual automatic speech recognition with language detection and op...
Generate expressive, multilingual speech audio from text input. Produce zero-shot multi-speaker dialogues, emotional del...
Clone a voice from a short reference clip and synthesize multilingual speech from text. Provide a text prompt and at lea...
Generate text responses from conversational prompts and instructions with support for 128K context length. Supports adva...
Generate speech audio from text with selectable multilingual voices. Accepts text input, a preset voice, and a speed mul...
Generate speech audio from text across 25 languages. Accepts text and a language selection; returns spoken audio. Suppor...
Clone a speaker's voice and synthesize speech from text, including cross-lingual and mixed-lingual output. Accepts refer...
Generate speech audio from text with multilingual preset voices. Accepts text plus optional language selection, voice pr...
Generate high-resolution images up to 4096x4096 from text prompts with fast generation speed. Uses linear diffusion tra...