
jigsawstack/speech-to-text
Transcribe speech to text from audio or video inputs. Auto-detect language or specify one, and optionally translate the...
Found 38 models (showing 21-38)
Transcribe speech to text from audio files or HLS m3u8 streams with optional word-level timestamps. Accept audio uploads...
Transcribe and analyze audio into text. Accepts an audio input with an optional language code and returns text transcrip...
Transcribe speech to text with speaker diarization for noisy, multi-speaker audio. Accepts an audio file (upload, URL, o...
Transcribe speech from audio into text. Support multilingual transcription with automatic language identification and op...
Transcribe speech and analyze audio with question-answering and summarization from an input audio file, returning text....
Convert documents, web pages, images, and audio into Markdown text. Accept a single file input (PDF, DOCX, PPTX, XLSX, H...
Translate speech and text across 100+ languages, returning text and optionally translated speech audio. Supports speech-...
Transcribe long-form audio into timestamped text. Process multiple audio chunks with WhisperX, then merge results into a...
Transcribe multilingual audio to text with time-aligned segments. Accepts an audio file and outputs segment- and word-le...
Add karaoke-style subtitles to a video. Takes a video as input, auto-transcribes speech with Whisper, and outputs a capt...
Transcribe and analyze audio content with Canary-Qwen-2.5B, a speech-to-text model that provides perfect transcription w...
Transcribe speech to text with optional word-level timestamps and segment metadata. Accepts an audio input and outputs e...
Transcribe Belarusian speech from an audio file into text. Accepts a single audio input and returns a plain-text transcr...
Transcribe two-speaker phone calls from separate operator and customer audio tracks into a time-stamped, speaker-labeled...
Generate timestamped subtitles from an audio or video file. Transcribe speech to text and return structured segments wit...
Transcribe English audio and separate speakers, returning a timestamped transcript with speaker labels. Accepts an audio...
Transcribe audio to text with speaker diarization and word-level timestamps. Takes an audio file as input and returns a...