vaibhavs10/incredibly-fast-whisper

27.1M runs, Nov 2023, Cog 0.9.4

Tags: speaker-diarization, speech-to-text, speech-translation

About

whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗

Example Output

{"text":" the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun","chunks":[{"text":" the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded","timestamp":[0,29.72]},{"text":" with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your","timestamp":[29.72,38.98]},{"text":" honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun","timestamp":[38.98,48.52]}]}
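The output above pairs each chunk's text with a [start, end] timestamp in seconds. As a minimal sketch of how that structure could be consumed, the following turns the chunks into SRT-style subtitle blocks. The helper names `format_timestamp` and `chunks_to_srt` are hypothetical and not part of the model, and the handling of a missing end timestamp is an assumption rather than documented behavior.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(result: dict) -> str:
    """Build SRT-style text from a result dict shaped like the example output."""
    blocks = []
    for index, chunk in enumerate(result.get("chunks", []), start=1):
        start, end = chunk["timestamp"]
        # Assumption: if the end time is missing, reuse the start time.
        if end is None:
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Shortened, illustrative data in the same shape as the example output.
    sample = {
        "chunks": [
            {"text": " with a mild wab.", "timestamp": [29.72, 38.98]},
            {"text": " honour.", "timestamp": [38.98, 48.52]},
        ]
    }
    print(chunks_to_srt(sample))
```

Chunk text is stripped because each chunk in the example output begins with a leading space.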

Performance Metrics

Prediction Time: 2.75s
Total Time: 2.72s

All Input Parameters

{
  "task": "transcribe",
  "audio": "https://replicate.delivery/pbxt/Js2Fgx9MSOCzdTnzHQLJXj7abLp3JLIG3iqdsYXV24tHIdk8/OSR_uk_000_0050_8k.wav",
  "batch_size": 64,
  "return_timestamps": true
}
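A sketch of submitting these same inputs from Python, assuming the third-party `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The version hash is the one listed under Version Details. The `transcribe` and `model_ref` wrappers are hypothetical helpers, and the client import is kept inside the function so the rest of the snippet can be inspected without the package.

```python
MODEL = "vaibhavs10/incredibly-fast-whisper"
VERSION = "3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c"

# The inputs shown above, as a Python dict.
EXAMPLE_INPUT = {
    "task": "transcribe",
    "audio": "https://replicate.delivery/pbxt/Js2Fgx9MSOCzdTnzHQLJXj7abLp3JLIG3iqdsYXV24tHIdk8/OSR_uk_000_0050_8k.wav",
    "batch_size": 64,
    "return_timestamps": True,
}


def model_ref() -> str:
    """Return the 'owner/name:version' reference used to pin a model version."""
    return f"{MODEL}:{VERSION}"


def transcribe(inputs: dict):
    """Run the model on Replicate and return its output (requires network access)."""
    import replicate  # third-party client: pip install replicate

    return replicate.run(model_ref(), input=inputs)


if __name__ == "__main__":
    output = transcribe(EXAMPLE_INPUT)
    print(output["text"])
```

Pinning the version hash keeps results tied to the specific build described on this page rather than whatever version is latest.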

Input Parameters

task (Default: transcribe): Task to perform: transcribe or translate to another language.
audio (required, Type: string): Audio file.
hf_token (Type: string): A Hugging Face access token (hf.co/settings/token) that lets Pyannote.audio diarise the audio clips. You must first agree to the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
language (Default: None): Language spoken in the audio. Specify 'None' to perform language detection.
timestamp (Default: chunk): Whisper supports both chunk-level and word-level timestamps.
batch_size (Type: integer, Default: 24): Number of parallel batches to compute. Reduce this if you run into out-of-memory (OOM) errors.
diarise_audio (Type: boolean, Default: false): Use Pyannote.audio to diarise the audio clips. You must also provide hf_token.
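The constraints in this parameter list can be checked before a request is sent. The sketch below is a hypothetical `build_input` helper, not part of the model or any client library, that applies the documented defaults and rejects diarisation without a token. The value "word" for `timestamp` is an assumption inferred from the description, and omitting `language` to trigger detection is likewise an assumption.

```python
from typing import Optional

ALLOWED_TASKS = {"transcribe", "translate"}
# Assumption: "word" is the value for word-level timestamps; only "chunk" is documented.
ALLOWED_TIMESTAMPS = {"chunk", "word"}


def build_input(
    audio: str,
    task: str = "transcribe",
    language: Optional[str] = None,
    timestamp: str = "chunk",
    batch_size: int = 24,
    diarise_audio: bool = False,
    hf_token: Optional[str] = None,
) -> dict:
    """Assemble an input dict using the documented defaults, with basic checks."""
    if not audio:
        raise ValueError("audio is required")
    if task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}")
    if timestamp not in ALLOWED_TIMESTAMPS:
        raise ValueError(f"timestamp must be one of {sorted(ALLOWED_TIMESTAMPS)}")
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if diarise_audio and not hf_token:
        raise ValueError("diarise_audio requires hf_token")

    payload = {
        "audio": audio,
        "task": task,
        "timestamp": timestamp,
        "batch_size": batch_size,
        "diarise_audio": diarise_audio,
    }
    # Assumption: leaving language out lets the model fall back to detection.
    if language is not None:
        payload["language"] = language
    if hf_token:
        payload["hf_token"] = hf_token
    return payload


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(build_input("https://example.com/audio.wav", batch_size=8))
```

Lowering `batch_size`, as in the usage line, follows the page's advice for out-of-memory errors.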

Version Details

Version ID: 3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c
Version Created: February 16, 2024
