
zsxkib/audio-flamingo-3
Answer questions about audio with text output, performing step-by-step reasoning across speech, music, and sound effects...
Found 7 models (showing 1-7)
Answer questions about audio with text output, performing step-by-step reasoning across speech, music, and sound effects...
Generate mouth cue JSON from speech audio for lipβsync animation. Accepts base64-encoded audio (MP3, WAV, FLAC, AAC, OGG...
Segment an audio recording by speaker and return time-stamped speaker labels as JSON. Accepts an audio file as input; op...
Transcribe long-form audio to text with optional timestamps and prompt-driven analysis. Accept an audio file and return...
Generate music tags from audio files using state-of-the-art CNN-based models. Supports multiple model variants including...
Separate audio mixtures into individual tracks including bass, drums, vocals, and other components.
Transcribe and analyze audio content with Canary-Qwen-2.5B, a speech-to-text model that provides perfect transcription w...