mattsegal/incredibly-fast-whisper-distil-large-v2
Transcribe speech from an audio file to text. Leverage OpenAI Whisper large-v3 with an implementation optimized for fast...
Found 73 models (showing 61-73)
Transcribe audio to time-aligned SRT subtitles. Accepts an audio file as input and returns an SRT subtitle file with tim...
Transcribe and optionally translate multilingual audio to English text. Accepts an audio file and returns transcripts as...
Transcribe Hindi speech from audio into text. Takes an audio file as input and returns Hindi transcripts for tasks like...
Prepare datasets for fine-tuning Whisper ASR models. Accepts either tarballs of audio files and matching text transcript...
Transcribe and understand audio with Voxtral Mini 3B, an advanced model that builds upon Ministral-3B. It excels in spee...
Transcribe speech to text from an audio input. Optionally translate to English, perform speaker diarization, and bias re...
Transcribe multilingual audio with speaker diarization and channel separation. Accepts an audio file and outputs text tr...
Generate text responses from text, image, and audio inputs. Perform image captioning and visual question answering, OCR,...
Auto-caption videos with TikTok-style on-screen subtitles. Transcribe speech using Whisper large-v3 with automatic langu...
Add autogenerated, stylized subtitles to a video. Input a video (optional: background music and/or a word-level transcri...
Transcribe speech to text across 1,693 languages. Accepts short audio clips and returns a text transcript, with automati...
Transcribe speech to text from short audio clips in 1,693 languages. Accept audio input and optionally a specified langu...