konieshadow/speaker-diarization
Segment speakers in an audio file and return time-stamped speaker labels (diarization). Accepts audio plus optional num_...
Found 27 models (showing 1-20)
Segment speakers in an audio file and return time-stamped speaker labels (diarization). Accepts audio plus optional num_...
Transcribe or translate speech to text from audio input. Run Whisper Large v3 with batched inference and Flash Attention...
Transcribe audio with speaker diarization. Takes an audio input and returns a text transcript with per-speaker labels, s...
Transcribe audio to text with word-level timestamps and optional speaker diarization. Accepts an audio file with optiona...
Transcribe and structure spoken conversations from an audio input. Accept an audio file with optional session context (u...
Transcribe audio to text with fast, batched speech recognition. Accept an audio file as input and return a transcript wi...
Transcribe speech from audio or video into text. Outputs a full transcript with optional per-segment timestamps and spea...
Transcribe and diarize noisy multi-speaker audio. Accept audio files or base64 and output structured segments with text,...
Transcribe long-form audio from multiple chunks into timestamped text. Accepts an array of audio chunks with total durat...
Identify who spoke when in an audio file. Takes a single audio recording as input and returns a diarization JSON with sp...
Separate speakers in audio recordings. Accept an audio file and a JSON list of time segments (start, duration), and clus...
Segment speakers in audio recordings. Take an audio file and return time-stamped speech segments labeled by speaker, the...
Transcribe two-speaker phone calls with timestamps and speaker labels. Accepts two audio tracks (operator and customer)...
Transcribe English speech from an audio input and label speakers with diarization. Return structured JSON with timestamp...
Transcribe audio to text with speaker diarization and word-level timestamps. Takes an audio file as input and returns a...
Transcribe or translate speech from audio files and videos to text. Accept audio or video input and return a transcript...
Transcribe hours-long audio to text with WhisperX large-v3, generating segment timestamps and optional word-level alignm...
Generate synchronized SRT subtitles from an audio input. Transcribe with WhisperX (faster-whisper-large-v3) and align wo...
Transcribe Spanish audio to text with optional speaker diarization and timestamps. Accepts an audio input and returns ei...
Transcribe or translate audio to text with word-level timestamps and optional speaker diarization. Accept audio input an...