turian/insanely-fast-whisper-with-video

7.5M runs, Jan 2024, Cog 0.8.6
speaker-diarization speech-to-text video-auto-captioning

About

whisper-large-v3, incredibly fast, with video transcription

Example Output

{"text":" Do not fing touch it. Here is maybe the most overlooked feature or factor in the success or failure of a steak, particularly a thick steak, but it's true of all meat. This magical period immediately following its removal from the heat, it should rest on the board, meaning sit there at room temperature for five to seven minutes, at which point stay away from it. Don't touch it. Don't poke it. Don't slice it to look inside. Do not start slicing it into slices right away. What's going on inside is it is continuing to cook, but even more importantly, the juices are distributing themselves in a truly wonderful alignment. That's why if you cut into a steak too quickly off the barbecue, you get this sort of bullseye pattern instead of what it should be, a gentle graduation from red to various hues of pink to the outer crust. All the difference in the world between a good steak and a totally messed up steak is going on in that period of time that you're just doing nothing. Nothing. You want to find a good hot sizzling either grill or pan, get a good crust and sear on the outside. You want to finish it either in the oven or all the way on the grill. And then just let it sit. Don't wrap it in foil, don't cover it, don't poke it, don't prod it, don't even look at it. Just let it sit there, leave it alone, and you will be rewarded.","chunks":[{"text":" Do not fing touch it.","timestamp":[0,2.62]},{"text":" Here is maybe the most overlooked feature or factor in the success or failure of a steak,","timestamp":[9.46,16.9]},{"text":" particularly a thick steak, but it's true of all meat.","timestamp":[17,19.36]},{"text":" This magical period immediately following its removal from the heat,","timestamp":[19.78,24]},{"text":" it should rest","timestamp":[24,25.4]},{"text":" on the board, meaning sit there at room temperature for five to seven minutes, at which point","timestamp":[25.4,31.32]},{"text":" stay away from it.","timestamp":[31.32,33.5]},{"text":" Don't touch it.","timestamp":[33.5,34.5]},{"text":" Don't poke it.","timestamp":[34.5,35.72]},{"text":" Don't slice it to look inside.","timestamp":[35.72,37.88]},{"text":" Do not start slicing it into slices right away.","timestamp":[37.88,41.18]},{"text":" What's going on inside is it is continuing to cook, but even more importantly, the juices are distributing themselves in a truly wonderful","timestamp":[41.18,51.1]},{"text":" alignment. That's why if you cut into a steak too quickly off the barbecue, you","timestamp":[51.1,55.26]},{"text":" get this sort of bullseye pattern instead of what it should be, a gentle","timestamp":[55.26,59.84]},{"text":" graduation from red to various hues of pink to the outer crust.","timestamp":[59.84,66.16]},{"text":" All the difference in the world between a good steak and a totally messed up steak is","timestamp":[66.16,70.96]},{"text":" going on in that period of time that you're just doing nothing.","timestamp":[70.96,75.96]},{"text":" Nothing.","timestamp":[75.96,76.96]},{"text":" You want to find a good hot sizzling either grill or pan, get a good crust and sear on","timestamp":[76.96,82.86]},{"text":" the outside.","timestamp":[82.86,83.86]},{"text":" You want to finish it either in the oven or all the way on the grill.","timestamp":[83.86,87.88]},{"text":" And then just let it sit.","timestamp":[87.88,90.2]},{"text":" Don't wrap it in foil, don't cover it, don't poke it, don't prod it, don't even look at it.","timestamp":[90.2,94.98]},{"text":" Just let it sit there, leave it alone, and you will be rewarded.","timestamp":[94.98,99.5]}]}
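
Since the model is tagged for video auto-captioning, one plausible use of the "chunks" list is to turn it into a caption file. The following is a minimal sketch, not part of the model itself: the helper names srt_time and chunks_to_srt are hypothetical, the SRT format is an assumption about what a caller might want, and it assumes every chunk carries both a start and an end time in seconds, as in the example output.

```python
import json


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(output: dict) -> str:
    """Build SRT caption text from the "chunks" list of the model output."""
    entries = []
    for index, chunk in enumerate(output["chunks"], start=1):
        start, end = chunk["timestamp"]  # assumed: [start_seconds, end_seconds]
        text = chunk["text"].strip()     # chunk text starts with a space
        entries.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(entries)


# Two chunks copied from the example output.
sample = json.loads(
    '{"chunks":[{"text":" Do not fing touch it.","timestamp":[0,2.62]},'
    '{"text":" Nothing.","timestamp":[75.96,76.96]}]}'
)
print(chunks_to_srt(sample))
```

A caller would pass the full parsed output instead of the two-chunk sample. Chunks with a missing end time, if they occur, would need separate handling.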

Performance Metrics

6.52s Prediction Time
19.37s Total Time
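
To put these two figures in context, the short calculation here compares them with the length of the transcribed audio. It is a rough sketch: the audio length is approximated by the end timestamp of the last chunk in the example output (99.5 s), which is an assumption, and the page does not say what the time outside prediction consists of.

```python
# Figures copied from this page. The audio length is approximated by the end
# timestamp of the last chunk in the example output (an assumption).
audio_seconds = 99.5
prediction_seconds = 6.52
total_seconds = 19.37

# How many seconds of audio are transcribed per second of prediction time.
speed_factor = audio_seconds / prediction_seconds

# Time not spent in prediction; the page does not break this down further.
overhead_seconds = total_seconds - prediction_seconds

print(f"speed vs. real time: {speed_factor:.1f}x")
print(f"time outside prediction: {overhead_seconds:.2f}s")
```

Under these assumptions the prediction runs roughly 15 times faster than real time, and about two thirds of the total time falls outside the prediction itself.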
All Input Parameters
{
  "url": "https://www.youtube.com/watch?v=2ua_v4BA3qM",
  "task": "transcribe",
  "timestamp": "chunk",
  "batch_size": 64,
  "diarise_audio": false
}
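
These inputs could be sent from Python roughly as follows. This is a sketch under assumptions, not the author's documented usage: it assumes the third-party replicate client package is installed and a REPLICATE_API_TOKEN is set in the environment, and the helper names build_input and transcribe are hypothetical. The version ID is the one listed under Version Details on this page.

```python
import json

# Model name and version ID as listed on this page.
MODEL = "turian/insanely-fast-whisper-with-video"
VERSION = "4f41e90243af171da918f04da3e526b2c247065583ea9b757f2071f573965408"
MODEL_REF = f"{MODEL}:{VERSION}"


def build_input(url: str, batch_size: int = 64) -> dict:
    """Return the same input payload as the example on this page."""
    return {
        "url": url,
        "task": "transcribe",
        "timestamp": "chunk",
        "batch_size": batch_size,
        "diarise_audio": False,
    }


def transcribe(url: str):
    """Run the model on Replicate (network call, needs an API token)."""
    import replicate  # third-party client, imported lazily (assumed installed)

    return replicate.run(MODEL_REF, input=build_input(url))


# Printing the payload needs no network access or token.
print(json.dumps(build_input("https://www.youtube.com/watch?v=2ua_v4BA3qM"), indent=2))
```

Calling transcribe("https://www.youtube.com/watch?v=2ua_v4BA3qM") would be expected to return output shaped like the example on this page, with "text" and "chunks" keys, though that depends on the client version in use.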
Input Parameters
url Type: string
Video URL for yt-dlp to download the audio from. Either this or audio must be provided.
task Default: transcribe
Task to perform: transcribe the audio, or translate it to another language.
audio Type: string
Audio file. Either this or url must be provided.
hf_token Type: string
Hugging Face access token (created at hf.co/settings/token) that Pyannote.audio needs to diarise the audio clips. You must first agree to the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
language Type: string
Optional. Language spoken in the audio; specify None to perform language detection.
timestamp Default: chunk
Whisper supports both chunk-level and word-level timestamps.
batch_size Type: integer Default: 64
Number of batches to compute in parallel. Reduce this if you run out of memory (OOM).
diarise_audio Type: boolean Default: false
Use Pyannote.audio to diarise the audio clips. This also requires hf_token.
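
The constraints stated in the parameter descriptions (either url or audio must be given, and diarisation needs hf_token) can be checked on the client side before a request is sent. The function here is a hypothetical sketch of such a check. It is not the model's own validation, which this page does not show, and the positive batch_size rule is an added assumption.

```python
from typing import List, Optional


def check_inputs(
    url: Optional[str] = None,
    audio: Optional[str] = None,
    diarise_audio: bool = False,
    hf_token: Optional[str] = None,
    batch_size: int = 64,
) -> List[str]:
    """Return a list of problems with the inputs; an empty list means none found."""
    problems = []
    # "Either this or audio must be provided."
    if not url and not audio:
        problems.append("provide either url or audio")
    # Diarisation uses Pyannote.audio, which needs a Hugging Face token.
    if diarise_audio and not hf_token:
        problems.append("diarise_audio requires hf_token")
    # Assumption: a batch size below 1 is never meaningful.
    if batch_size < 1:
        problems.append("batch_size must be a positive integer")
    return problems


print(check_inputs(url="https://www.youtube.com/watch?v=2ua_v4BA3qM"))
print(check_inputs(audio="talk.mp3", diarise_audio=True))
```

The first call prints an empty list and the second reports the missing token. The file name talk.mp3 is a placeholder.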
Example Execution Logs
[youtube] Extracting URL: https://www.youtube.com/watch?v=2ua_v4BA3qM
[youtube] 2ua_v4BA3qM: Downloading webpage
[youtube] 2ua_v4BA3qM: Downloading ios player API JSON
[youtube] 2ua_v4BA3qM: Downloading android player API JSON
[youtube] 2ua_v4BA3qM: Downloading m3u8 information
[info] 2ua_v4BA3qM: Downloading 1 format(s): 251
[download] Destination: 1f1c25825fc2491fb6315ec79f759ff4.webm
[download]   0.1% of    1.46MiB at  Unknown B/s ETA Unknown
[download]   0.2% of    1.46MiB at  Unknown B/s ETA Unknown
[download]   0.5% of    1.46MiB at  Unknown B/s ETA Unknown
[download]   1.0% of    1.46MiB at   10.98MiB/s ETA 00:00
[download]   2.1% of    1.46MiB at   17.64MiB/s ETA 00:00
[download]   4.2% of    1.46MiB at   29.24MiB/s ETA 00:00
[download]   8.5% of    1.46MiB at   49.04MiB/s ETA 00:00
[download]  17.0% of    1.46MiB at   82.09MiB/s ETA 00:00
[download]  34.1% of    1.46MiB at  133.52MiB/s ETA 00:00
[download]  68.3% of    1.46MiB at  207.44MiB/s ETA 00:00
[download] 100.0% of    1.46MiB at  253.97MiB/s ETA 00:00
[download] 100% of    1.46MiB in 00:00:00 at 10.46MiB/s
[ExtractAudio] Destination: 1f1c25825fc2491fb6315ec79f759ff4.mp3
Deleting original file 1f1c25825fc2491fb6315ec79f759ff4.webm (pass -k to keep)
Downloaded audio from the video URL https://www.youtube.com/watch?v=2ua_v4BA3qM
Voila!✨ Your file has been transcribed!
Removing downloaded audio file
max gpu memory allocated over runtime: 4.50 GB
Version Details
Version ID
4f41e90243af171da918f04da3e526b2c247065583ea9b757f2071f573965408
Version Created
January 8, 2024