jacksoby/whisperx
About
Transcribes audio to text and returns segment-level and word-level timestamps with a score per word.
Example Output
{"segments":[{"end":2.937,"text":" This is the final build, screenshot.","start":0.031,"words":[{"end":0.844,"word":"This","score":0.944,"start":0.031},{"end":0.986,"word":"is","score":0.692,"start":0.925},{"end":1.108,"word":"the","score":0.994,"start":1.047},{"end":1.413,"word":"final","score":0.901,"start":1.149},{"end":1.718,"word":"build,","score":0.914,"start":1.454},{"end":2.937,"word":"screenshot.","score":0.767,"start":2.084}]}],"word_segments":[{"end":0.844,"word":"This","score":0.944,"start":0.031},{"end":0.986,"word":"is","score":0.692,"start":0.925},{"end":1.108,"word":"the","score":0.994,"start":1.047},{"end":1.413,"word":"final","score":0.901,"start":1.149},{"end":1.718,"word":"build,","score":0.914,"start":1.454},{"end":2.937,"word":"screenshot.","score":0.767,"start":2.084}]}
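The word-level entries in the example output can be post-processed directly. The sketch below is illustrative only: it assumes that score is a per-word confidence between 0 and 1 (the page does not define it), and the helper name low_confidence_words and the 0.8 threshold are hypothetical choices, not part of the model.

```python
import json

# Abridged copy of the example output: the per-segment "words" list is
# omitted for brevity; "word_segments" is reproduced in full.
RAW_OUTPUT = """
{"segments": [{"start": 0.031, "end": 2.937,
               "text": " This is the final build, screenshot."}],
 "word_segments": [
   {"word": "This", "start": 0.031, "end": 0.844, "score": 0.944},
   {"word": "is", "start": 0.925, "end": 0.986, "score": 0.692},
   {"word": "the", "start": 1.047, "end": 1.108, "score": 0.994},
   {"word": "final", "start": 1.149, "end": 1.413, "score": 0.901},
   {"word": "build,", "start": 1.454, "end": 1.718, "score": 0.914},
   {"word": "screenshot.", "start": 2.084, "end": 2.937, "score": 0.767}
 ]}
"""


def low_confidence_words(output, threshold=0.8):
    """Return (word, score) pairs whose score is below the threshold.

    Hypothetical helper: treats "score" as a confidence value (an assumption)
    and skips any word entry that has no score at all.
    """
    return [
        (w["word"], w["score"])
        for w in output["word_segments"]
        if "score" in w and w["score"] < threshold
    ]


output = json.loads(RAW_OUTPUT)
flagged = low_confidence_words(output)
print(flagged)
```

Words flagged this way could be queued for manual review; the right threshold depends on the audio and is something to tune rather than a recommended value.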
Performance Metrics
- Prediction Time: 3.58s
- Total Time: 76.57s
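The gap between the two timings can be worked out with simple arithmetic. The snippet below only subtracts the reported figures; what the remaining time covers (for example startup or the model download visible in the execution logs) is not stated on this page, so any attribution is an assumption.

```python
# Figures copied from the metrics above, in seconds.
prediction_s = 3.58
total_s = 76.57

# Time not spent in prediction; its cause is not stated on the page.
overhead_s = round(total_s - prediction_s, 2)

# Fraction of the total run that was actual prediction.
prediction_share = prediction_s / total_s

print(f"{overhead_s:.2f}s outside prediction")
print(f"prediction is {prediction_share:.1%} of total time")
```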
Input Parameters
- audio_file (required): Audio file
Output Schema
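- segments: Segments. In the example output, each segment has start, end, text, and a words list (word, start, end, score).
- word_segments: Word Segments. In the example output, this is a flat list of the same per-word entries.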
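As an illustration of consuming this schema, the sketch below turns a segments list into SubRip (SRT) caption text. It assumes start and end are expressed in seconds, which matches the example output but is not stated explicitly; srt_timestamp and segments_to_srt are hypothetical helper names, not part of the model.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text with one numbered caption per segment."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Segment text in the example output has a leading space; strip it.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive captions.
    return "\n".join(blocks)


# Segment taken from the example output (its "words" list is not needed here).
example_segments = [
    {"start": 0.031, "end": 2.937,
     "text": " This is the final build, screenshot."},
]
srt_text = segments_to_srt(example_segments)
print(srt_text)
```

Working from segments keeps one caption per phrase; building captions from word_segments instead would allow finer, word-by-word timing at the cost of more grouping logic.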
Example Execution Logs
Warning: audio is shorter than 30s, language detection may be inaccurate.
Detected language: en (0.99) in first 30s of audio...
Downloading: "https://download.pytorch.org/torchaudio/models/wav2vec2_fairseq_base_ls960_asr_ls960.pth" to /root/.cache/torch/hub/checkpoints/wav2vec2_fairseq_base_ls960_asr_ls960.pth
0%| | 0.00/360M [00:00<?, ?B/s]
2%|▏ | 5.88M/360M [00:00<00:06, 61.2MB/s]
5%|▌ | 18.2M/360M [00:00<00:03, 101MB/s]
9%|▊ | 31.1M/360M [00:00<00:02, 115MB/s]
14%|█▍ | 49.6M/360M [00:00<00:02, 146MB/s]
25%|██▍ | 88.6M/360M [00:00<00:01, 240MB/s]
38%|███▊ | 135M/360M [00:00<00:00, 324MB/s]
54%|█████▎ | 193M/360M [00:00<00:00, 415MB/s]
68%|██████▊ | 247M/360M [00:00<00:00, 462MB/s]
81%|████████▏ | 293M/360M [00:00<00:00, 468MB/s]
94%|█████████▎| 338M/360M [00:01<00:00, 463MB/s]
100%|██████████| 360M/360M [00:01<00:00, 357MB/s]
Version Details
- Version ID: b484c1fc8bb7096df7fea8c9628adee66cedc6088d1cbcc56a72674df05c5c24
- Version Created: July 12, 2025