hovevideo/stable-whisper 📝❓ → 🖼️

▶️ 173 runs 📅 May 2024 ⚙️ Cog 0.9.7 🔗 GitHub
speech-to-text video-auto-captioning

About

Transcribe audio using OpenAI's Whisper, with timestamps stabilized by the stable-ts Python package.

Example Output

Performance Metrics

8.87s Prediction Time
110.33s Total Time
Input Parameters
url (required) Type: string
Audio or video URL
output_format Default: json
Output format: ass (ASS subtitles) or json (transcription in JSON format).
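
As a rough illustration of how the two inputs above fit together, the sketch below validates them and passes them to the Replicate Python client. The helper names `build_input` and `transcribe` are hypothetical and not part of the model. The call assumes the third-party `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The model reference combines the model name with the version ID listed under Version Details.

```python
# Assumed model reference: "<owner>/<name>:<version id>" taken from this page.
MODEL_REF = (
    "hovevideo/stable-whisper:"
    "a1697797eeccbcfc1955282a28b9aa1120335841bc8717911da8cd1e07ffefab"
)

# The two output formats documented for this model.
ALLOWED_FORMATS = ("ass", "json")


def build_input(url: str, output_format: str = "json") -> dict:
    """Build the input payload (hypothetical helper, for illustration only)."""
    if not url:
        raise ValueError("url is required")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
    return {"url": url, "output_format": output_format}


def transcribe(url: str, output_format: str = "json"):
    """Run the model; needs network access and an API token."""
    import replicate  # third-party client, imported lazily

    # Depending on the client version, the result may be a URL string
    # or a file-like object wrapping that URL.
    return replicate.run(MODEL_REF, input=build_input(url, output_format))
```

Calling `transcribe("https://example.com/talk.mp4", "ass")` would request ASS subtitles for a placeholder URL; omitting the second argument falls back to the documented default, `json`.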
Output Schema

Output

Type: string
Format: uri
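
Because the output is a URI rather than the transcript itself, a client has to fetch it in a second step. The following is a minimal sketch using only the Python standard library. The helper names `local_name` and `download` are hypothetical, and the fallback file name is an assumption, since this page does not say what the output file is called.

```python
import posixpath
import urllib.request
from urllib.parse import urlparse


def local_name(uri: str, fallback: str = "transcript") -> str:
    """Derive a local file name from the last path segment of the URI."""
    name = posixpath.basename(urlparse(uri).path)
    return name or fallback


def download(uri: str) -> str:
    """Fetch the output URI and save it in the current directory."""
    target = local_name(uri)
    with urllib.request.urlopen(uri) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

The file name is taken from the URI so that an `.ass` or `.json` extension, if present, carries over to the saved file.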

Example Execution Logs
Transcribe:   0%|          | 0/52.42 [00:00<?, ?sec/s]
Detected language: english
Transcribe:   0%|          | 0/52.42 [00:01<?, ?sec/s]
Transcribe:  55%|█████▌    | 28.88/52.42 [00:06<00:05,  4.61sec/s]
Transcribe:  89%|████████▉ | 46.84/52.42 [00:07<00:00,  6.95sec/s]
Transcribe: 100%|█████████▉| 52.41/52.42 [00:08<00:00,  6.87sec/s]
Transcribe: 100%|█████████▉| 52.41/52.42 [00:08<00:00,  6.37sec/s]
Saved: /src/input.ass
Saved: /src/input.json
Version Details
Version ID
a1697797eeccbcfc1955282a28b9aa1120335841bc8717911da8cd1e07ffefab
Version Created
May 9, 2024