speech-to-text AI Models - Page 3

hnesk/whisper-wordtimestamps

Transcribe audio to text with per-word timestamps. Outputs the full transcript, time-coded segments, detected language,...

speech-to-text • word-timestamps • language-detection • 1.5M runs

wglodell/cog-whisperx-withprompt

Transcribes audio files to text using WhisperX with support for an initial prompt to guide the transcription process. Fe...

speech-to-text • 340.7K runs

dashed/whisperx-subtitles-replicate

Generate synchronized SRT subtitles from an audio input. Transcribe with WhisperX (faster-whisper-large-v3) and align wo...

speech-to-text • speaker-diarization • subtitle-generation • 23.2K runs

aqasemi/whisper-jax

Transcribe speech to text from an audio input. Uses OpenAI Whisper Large-v2 implemented in JAX for up to 15x faster infe...

speech-to-text • language-detection • 122.5K runs

mercurio005/whisperx-spanish

Transcribe Spanish audio to text with optional speaker diarization and timestamps. Accepts an audio input and returns ei...

speech-to-text • speaker-diarization • spanish • 44.8K runs

villesau/whisper-timestamped

Transcribe or translate speech from audio into text with word-level timestamps and confidence scores. Support multilingu...

speech-to-text • word-level-timestamps • 5.9K runs

cjwbw/whisper

Transcribe multilingual audio to text and subtitles. Accepts an audio file and returns a transcription, timestamped segm...

speech-to-text • subtitle-generation • 54.9K runs

awerks/whisperx

Transcribe or translate audio to text with word-level timestamps and optional speaker diarization. Accept audio input an...

speech-to-text • speaker-diarization • word-level-timestamps • 14.7K runs

nateraw/whisper-large-v3

Transcribe speech to text from audio input. Accepts an audio file and optionally a source language, returns a transcript...

speech-to-text • multilingual • 4.4K runs

erium/whisperx

Transcribe audio into text with word-level timestamps and optional speaker diarization. Supports multilingual speech-to-...

speech-to-text • speaker-diarization • 5.1K runs

stayallive/whisper-subtitles

Generate subtitles (.srt and .vtt) from audio files. Transcribe speech with Whisper via faster-whisper (CTranslate2) and...

speech-to-text • subtitle-generation • 5.3K runs

nicknaskida/incredibly-fast-whisper

Transcribe and optionally translate speech from audio to text at high speed. Leverage Whisper Large v3 via Hugging Face...

speech-to-text • speaker-diarization • 329 runs

cjwbw/whisper-downloadable-subtitles

Generate subtitles from audio. Accepts an audio file and returns a transcript with detected language, optional English t...

speech-to-text • subtitle-generation • 2.6K runs

zsxkib/whisper-lazyloading

Transcribe speech from audio into text with Whisper large-v3, supporting multilingual transcription, automatic language...

speech-to-text • subtitle-generation • 142 runs

cutzudev/whisper-x

Transcribe audio to text with optional translation, word-level timestamps, and speaker diarization. Accept an audio inpu...

speech-to-text • speaker-diarization • 405 runs

collectiveai-team/whisper-wordtimestamps

Transcribe speech to text with optional word-level timestamps. Accepts an audio input and returns a transcript plus dete...

speech-to-text • word-level-timestamps • language-detection • 1.3K runs

mattsegal/incredibly-fast-whisper-distil-medium-en

Transcribe English speech from an audio input into text. Uses OpenAI Whisper medium.en and parallel batching (configurab...

speech-to-text • 891 runs

hovevideo/stable-whisper

Transcribe audio or video to text. Accepts an audio or video input and returns a JSON transcript or ASS subtitles, lever...

speech-to-text • video-to-text • video-auto-captioning • 173 runs

cjwbw/distil-whisper

Transcribe speech from audio to text. Leverage Distil-Whisper variants (distil-large-v2, distil-medium.en) that run up t...

speech-to-text • 277 runs

venkr/whisperx-diarization

Transcribe audio to text with optional speaker diarization. Uses WhisperX (Whisper Large V2) for transcription and Pyann...

speech-to-text • speaker-diarization • 352 runs