konieshadow/speaker-diarization
Separate speakers in audio into labeled time segments. Accepts an audio file and optional constraints on the number of s...
Found 27 models (showing 1-20)
Transcribe or translate speech to text from audio input. Run Whisper Large v3 with batched inference and Flash Attention...
Transcribe audio with speaker diarization. Takes an audio input and returns a text transcript with per-speaker labels, s...
Transcribe audio to text with word-level timestamps and optional speaker diarization. Accepts an audio file with optiona...
Transcribe and structure spoken conversations from an audio input. Accept an audio file with optional session context (u...
Transcribe audio to text with fast, batched speech recognition. Accept an audio file as input and return a transcript wi...
Transcribe speech from audio or video into text. Outputs a full transcript with optional per-segment timestamps and spea...
Transcribe and diarize noisy multi-speaker audio. Accept audio files or base64 and output structured segments with text,...
Transcribe long-form audio from multiple chunks into timestamped text. Accepts an array of audio chunks with total durat...
Identify who spoke when in an audio file. Takes a single audio recording as input and returns a diarization JSON with sp...
Cluster speech segments by speaker in an audio recording. Takes an audio input and a JSON list of segment records (start...
Identify and segment speakers in audio recordings. Takes an audio file as input and returns JSON with speaker-labeled ti...
Transcribe two-speaker phone calls with timestamps and speaker labels. Accepts two audio tracks (operator and customer)...
Transcribe English speech from an audio input and label speakers with diarization. Return structured JSON with timestamp...
Transcribe audio to text with speaker diarization and word-level timestamps. Takes an audio file as input and returns a...
Transcribe or translate speech from audio files and videos to text. Accept audio or video input and return a transcript...
Transcribe hours-long audio to text with WhisperX large-v3, generating segment timestamps and optional word-level alignm...
Generate synchronized SRT subtitles from an audio input. Transcribe with WhisperX (faster-whisper-large-v3) and align wo...
Transcribe Spanish audio to text with optional speaker diarization and timestamps. Accepts an audio input and returns ei...
Transcribe or translate audio to text with word-level timestamps and optional speaker diarization. Accept audio input an...