daanelson/whisperx

89.1K runs, Jun 2023, Cog 0.8.0-beta8
speaker-diarization speech-to-text

About

Accelerated transcription of audio using WhisperX

Example Output

Performance Metrics

2.72s Prediction Time
2.70s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/J5r78wKSymorzW9idAbbbJ7iXQl9GddZTwfdX5OlLJW2hLR2/OSR_uk_000_0050_8k.wav",
  "batch_size": 32
}
Input Parameters
audio (required) Type: string
Audio file
debug Type: boolean, Default: false
Print out memory usage information.
only_text Type: boolean, Default: false
Set to return only the transcribed text; otherwise, segment metadata is returned as well.
batch_size Type: integer, Default: 32
Parallelization of input audio transcription
align_output Type: boolean, Default: false
Set if you need word-level timing and not just batched transcription. Currently works only for English.
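
As an illustration of the parameters above, here is a minimal Python sketch that assembles an input payload using the documented defaults. The helper name build_whisperx_input is hypothetical and not part of the model or any client library. The check that batch_size is positive is an assumption, since the listing only states that it is an integer.

```python
from typing import Any, Dict


def build_whisperx_input(
    audio: str,
    debug: bool = False,
    only_text: bool = False,
    batch_size: int = 32,
    align_output: bool = False,
) -> Dict[str, Any]:
    """Assemble an input payload with the defaults listed above (hypothetical helper)."""
    if not isinstance(audio, str) or not audio:
        raise ValueError("audio must be a non-empty string, e.g. a URL to the file")
    # bool is a subclass of int in Python, so reject it explicitly.
    # Requiring batch_size >= 1 is an assumption; the listing only says "integer".
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return {
        "audio": audio,
        "debug": bool(debug),
        "only_text": bool(only_text),
        "batch_size": batch_size,
        "align_output": bool(align_output),
    }


# Example: request plain text only for the sample file shown on this page.
payload = build_whisperx_input(
    "https://replicate.delivery/pbxt/J5r78wKSymorzW9idAbbbJ7iXQl9GddZTwfdX5OlLJW2hLR2/OSR_uk_000_0050_8k.wav",
    only_text=True,
)
```

Leaving align_output at false matches the default; per the description above, enabling it is only useful for English audio.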
Output Schema

Output

Version Details
Version ID
9aa6ecadd30610b81119fc1b6807302fd18ca6cbb39b3216f430dcf23618cedd
Version Created
June 30, 2023
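
A pinned version like this one is typically referenced as owner/name:version. The sketch below shows one way to call it from Python, assuming the third-party replicate client package is installed and that a REPLICATE_API_TOKEN environment variable is set. The helper names model_ref and transcribe are hypothetical, and the shape of the returned value is not specified on this page.

```python
import os

MODEL = "daanelson/whisperx"
VERSION = "9aa6ecadd30610b81119fc1b6807302fd18ca6cbb39b3216f430dcf23618cedd"


def model_ref(model: str = MODEL, version: str = VERSION) -> str:
    """Build the pinned 'owner/name:version' reference string."""
    return f"{model}:{version}"


def transcribe(audio_url: str, batch_size: int = 32):
    """Run the pinned version on one audio file (hypothetical helper)."""
    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("set REPLICATE_API_TOKEN before calling transcribe()")
    # Imported lazily so the module loads even without the client installed.
    import replicate

    return replicate.run(
        model_ref(),
        input={"audio": audio_url, "batch_size": batch_size},
    )
```

Pinning the version ID keeps results reproducible if the model owner publishes a newer version later.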