vaibhavs10/incredibly-fast-whisper

▶️ 17.8M runs 📅 Nov 2023 ⚙️ Cog 0.9.4
speaker-diarization speech-to-text speech-translation

About

whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗

Example Output

Output

{"text":" the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun","chunks":[{"text":" the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit hours fly by much too soon. The room was crowded","timestamp":[0,29.72]},{"text":" with a mild wab. The room was crowded with a wild mob. This strong arm shall shield your","timestamp":[29.72,38.98]},{"text":" honour. She blushed when he gave her a white orchid The beetle droned in the hot June sun","timestamp":[38.98,48.52]}]}
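The "chunks" array in the output above pairs each text segment with a [start, end] timestamp in seconds. A minimal sketch of turning those chunks into readable, time-stamped lines follows. The helper names format_timestamp and chunk_lines are hypothetical, the sample is an excerpt of the example output, and the handling of a missing end time is a defensive assumption rather than documented behaviour of this model.

```python
def format_timestamp(seconds):
    """Render a time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def chunk_lines(output):
    """Turn the 'chunks' list of a model output into one line per chunk."""
    lines = []
    for chunk in output.get("chunks", []):
        start, end = chunk["timestamp"]
        # Assumption: an end time could be missing (None); show it as open-ended.
        end_label = format_timestamp(end) if end is not None else "..."
        lines.append(
            f"[{format_timestamp(start)} -> {end_label}] {chunk['text'].strip()}"
        )
    return lines


# Excerpt of the example output shown above (second and third chunks only).
SAMPLE = {
    "chunks": [
        {
            "text": " with a mild wab. The room was crowded with a wild mob."
                    " This strong arm shall shield your",
            "timestamp": [29.72, 38.98],
        },
        {
            "text": " honour. She blushed when he gave her a white orchid"
                    " The beetle droned in the hot June sun",
            "timestamp": [38.98, 48.52],
        },
    ]
}

if __name__ == "__main__":
    for line in chunk_lines(SAMPLE):
        print(line)
```

Each chunk's leading space is stripped before printing, since the segments in the example output begin with a space.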

Performance Metrics

2.75s Prediction Time
2.72s Total Time
All Input Parameters
{
  "task": "transcribe",
  "audio": "https://replicate.delivery/pbxt/Js2Fgx9MSOCzdTnzHQLJXj7abLp3JLIG3iqdsYXV24tHIdk8/OSR_uk_000_0050_8k.wav",
  "batch_size": 64,
  "return_timestamps": true
}
Input Parameters
task Default: transcribe
Task to perform: transcribe the audio, or translate it into English.
audio (required) Type: string
Audio file
hf_token Type: string
Provide a Hugging Face access token (hf.co/settings/token) so that Pyannote.audio can diarise the audio clips. You must first agree to the terms in 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
language Default: None
Language spoken in the audio. Specify 'None' to perform language detection.
timestamp Default: chunk
Whisper supports both chunk-level and word-level timestamps.
batch_size Type: integer Default: 24
Number of parallel batches to compute. Reduce it if you run out of memory (OOM).
diarise_audio Type: boolean Default: false
Use Pyannote.audio to diarise the audio clips. You must also provide hf_token.
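The parameters above can be assembled into an input payload and checked before a prediction is submitted. The sketch below is one way to do that, not the model's own validation. The function names build_input and run_prediction are hypothetical, and the "word" value for timestamp is inferred from the parameter description rather than confirmed by it. The payload follows the parameter list (timestamp), not the return_timestamps key shown in the earlier example input. run_prediction assumes the third-party replicate Python client is installed and that a REPLICATE_API_TOKEN environment variable is set.

```python
MODEL = "vaibhavs10/incredibly-fast-whisper"
VERSION = "3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c"


def build_input(audio, task="transcribe", language=None, timestamp="chunk",
                batch_size=24, diarise_audio=False, hf_token=None):
    """Build an input payload from the documented parameters and defaults."""
    if task not in {"transcribe", "translate"}:
        raise ValueError("task must be 'transcribe' or 'translate'")
    # Assumption: 'word' is the value for word-level timestamps.
    if timestamp not in {"chunk", "word"}:
        raise ValueError("timestamp must be 'chunk' or 'word'")
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if diarise_audio and not hf_token:
        raise ValueError("diarise_audio requires hf_token")

    payload = {
        "audio": audio,
        "task": task,
        "timestamp": timestamp,
        "batch_size": batch_size,
        "diarise_audio": diarise_audio,
    }
    # Leave optional keys out so the model's own defaults apply.
    if language is not None:
        payload["language"] = language
    if hf_token:
        payload["hf_token"] = hf_token
    return payload


def run_prediction(payload):
    """Submit the payload to Replicate (needs network access and an API token)."""
    import replicate  # third-party client, imported lazily

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)


if __name__ == "__main__":
    audio_url = (
        "https://replicate.delivery/pbxt/"
        "Js2Fgx9MSOCzdTnzHQLJXj7abLp3JLIG3iqdsYXV24tHIdk8/OSR_uk_000_0050_8k.wav"
    )
    print(build_input(audio_url, batch_size=64))
```

Lowering batch_size in build_input is the first thing to try if a prediction fails with an out-of-memory error, as the batch_size description suggests.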
Version Details
Version ID
3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c
Version Created
February 16, 2024