daanelson/whisperx

98.9K runs, Jun 2023, Cog 0.8.0-beta8

Performance

2.7s Typical run time

98.9K Total runs

Accelerated transcription of audio using WhisperX

2.72s Prediction Time

2.70s Total Time

All Input Parameters

{
  "audio": "https://replicate.delivery/pbxt/J5r78wKSymorzW9idAbbbJ7iXQl9GddZTwfdX5OlLJW2hLR2/OSR_uk_000_0050_8k.wav",
  "batch_size": 32
}

Input Parameters

audio (required) Type: string. Audio file.
debug Type: boolean. Default: false. Print out memory usage information.
only_text Type: boolean. Default: false. Set to return only the text; otherwise, segment metadata is returned as well.
batch_size Type: integer. Default: 32. Parallelization of input audio transcription.
align_output Type: boolean. Default: false. Set if you need word-level timing and not just batched transcription. Currently works only for English.
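
The parameter list above can be turned into a small input builder that fills in the documented defaults and rejects unknown or mistyped fields before a request is sent. This is a minimal sketch: `build_input` and `DEFAULTS` are hypothetical names, not part of the model or of any client library, and the check that `batch_size` is at least 1 is an assumption that the listing does not state.

```python
from typing import Any, Dict

# Defaults copied from the Input Parameters list.
DEFAULTS: Dict[str, Any] = {
    "debug": False,
    "only_text": False,
    "batch_size": 32,
    "align_output": False,
}


def build_input(audio: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: build the input payload with documented defaults."""
    # "audio" is the only required parameter and is typed as a string.
    if not isinstance(audio, str) or not audio:
        raise ValueError("audio is required and must be a non-empty string")

    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    payload: Dict[str, Any] = {"audio": audio, **DEFAULTS, **overrides}

    for name, default in DEFAULTS.items():
        # bool is a subclass of int in Python, so compare exact types to
        # keep True from passing as a batch_size.
        if type(payload[name]) is not type(default):
            raise TypeError(f"{name} must be of type {type(default).__name__}")

    # Assumption (not stated in the listing): batch_size must be positive.
    if payload["batch_size"] < 1:
        raise ValueError("batch_size must be at least 1")

    return payload
```

For example, `build_input("speech.wav", align_output=True)` returns a payload with `align_output` set and every other optional parameter at its documented default.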

Output Schema

Output

Version Details

Version ID: 9aa6ecadd30610b81119fc1b6807302fd18ca6cbb39b3216f430dcf23618cedd
Version Created: June 30, 2023
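
Pinning the version ID keeps results tied to this exact build of the model. The sketch below assumes the third-party `replicate` Python client is installed and that a `REPLICATE_API_TOKEN` environment variable is set; `model_ref` and `transcribe` are hypothetical helper names. The output schema is not shown in this listing, so the return value is passed through unchanged rather than parsed.

```python
MODEL = "daanelson/whisperx"
VERSION = "9aa6ecadd30610b81119fc1b6807302fd18ca6cbb39b3216f430dcf23618cedd"


def model_ref(model: str = MODEL, version: str = VERSION) -> str:
    """Build the "owner/name:version" reference used to pin a model version."""
    return f"{model}:{version}"


def transcribe(audio_url: str, batch_size: int = 32):
    """Hypothetical wrapper: run the pinned version on one audio file."""
    # Imported lazily so the module loads without the client installed.
    import replicate  # third-party client; reads REPLICATE_API_TOKEN

    # The structure of the result is not documented in this listing,
    # so it is returned as-is for the caller to inspect.
    return replicate.run(
        model_ref(),
        input={"audio": audio_url, "batch_size": batch_size},
    )
```

Calling `transcribe` with the audio URL from the example input reproduces the run shown under All Input Parameters, provided the token is valid and the version is still available.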