meronym/speaker-transcription 🖼️📝 → 🖼️

▶️ 28.3K runs 📅 Apr 2023 ⚙️ Cog 0.6.1
speaker-diarization speech-to-text

About

Whisper transcription plus speaker diarization

Example Output

Performance Metrics

65.26s Prediction Time
65.57s Total Time

Input Parameters
audio (required) Type: string
Audio file
prompt Type: string
Optional text to provide as a prompt for each Whisper model call.
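
The two inputs above can be assembled into a request payload. The following is a minimal sketch, assuming the third-party `replicate` Python client is installed and a `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_transcription` are hypothetical and not part of the model; the version hash is the one listed under Version Details.

```python
from typing import Optional

# Model reference: owner/name plus the version ID from "Version Details".
MODEL_REF = (
    "meronym/speaker-transcription:"
    "9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1"
)


def build_input(audio: str, prompt: Optional[str] = None) -> dict:
    """Assemble the input dict: `audio` is required, `prompt` is optional."""
    if not audio:
        raise ValueError("audio is required")
    payload = {"audio": audio}
    if prompt is not None:
        payload["prompt"] = prompt
    return payload


def run_transcription(audio: str, prompt: Optional[str] = None):
    """Run the model through the Replicate client (assumed to be installed)."""
    # Imported lazily so build_input stays usable without the client.
    import replicate

    return replicate.run(MODEL_REF, input=build_input(audio, prompt))
```

Keeping payload construction separate from the network call makes the required/optional distinction easy to check without spending a prediction.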
Output Schema

Output

Type: string, Format: uri
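
Because the output is a URI string rather than inline text, a caller has to fetch it to read the result. Below is a hedged sketch using only the Python standard library; `is_uri` and `fetch_output` are hypothetical helper names, and the schema does not state the format of the file behind the URI, so the bytes are returned unparsed.

```python
import urllib.request
from urllib.parse import urlparse


def is_uri(value: str) -> bool:
    """Return True if the value has both a scheme and a network location."""
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc)


def fetch_output(uri: str) -> bytes:
    """Download the file the model output points to and return its raw bytes."""
    if not is_uri(uri):
        raise ValueError(f"not a URI: {uri!r}")
    with urllib.request.urlopen(uri) as response:
        return response.read()
```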

Example Execution Logs
pre-processing audio file...
diarizing audio file...
post-processing diarization...
transcribing segments...
transcribing segment 0:00:00.497812 to 0:00:09.779063
transcribing segment 0:00:09.863438 to 0:03:34.962188
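
The segment lines in these logs can be parsed to see how the diarized audio was split. This is a small sketch, assuming every segment line follows the `H:MM:SS.ffffff to H:MM:SS.ffffff` pattern seen in the two sample lines above; `parse_segments` is a hypothetical helper, not part of the model.

```python
import re
from datetime import timedelta

# One timestamp: hours, minutes, seconds with an optional fractional part.
_TS = r"(\d+):(\d{2}):(\d{2}(?:\.\d+)?)"
SEGMENT_RE = re.compile(rf"transcribing segment {_TS} to {_TS}")

SAMPLE_LOG = """\
transcribing segments...
transcribing segment 0:00:00.497812 to 0:00:09.779063
transcribing segment 0:00:09.863438 to 0:03:34.962188
"""


def _to_timedelta(hours: str, minutes: str, seconds: str) -> timedelta:
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def parse_segments(log_text: str) -> list:
    """Return a list of (start, end) timedelta pairs, one per segment line."""
    segments = []
    for match in SEGMENT_RE.finditer(log_text):
        groups = match.groups()
        segments.append((_to_timedelta(*groups[:3]), _to_timedelta(*groups[3:])))
    return segments


if __name__ == "__main__":
    for start, end in parse_segments(SAMPLE_LOG):
        print(f"{start} -> {end} ({(end - start).total_seconds():.2f}s)")
```

In the sample run, the first segment lasts roughly 9 seconds and the second roughly 205 seconds.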
Version Details
Version ID
9950ee297f0fdad8736adf74ada54f63cc5b5bdfd5b2187366910ed5baf1a7a1
Version Created
April 27, 2023