turian/insanely-fast-whisper-with-video
About
whisper-large-v3, incredibly fast, with video transcription

Example Output
Output
{"text":" Do not fing touch it. Here is maybe the most overlooked feature or factor in the success or failure of a steak, particularly a thick steak, but it's true of all meat. This magical period immediately following its removal from the heat, it should rest on the board, meaning sit there at room temperature for five to seven minutes, at which point stay away from it. Don't touch it. Don't poke it. Don't slice it to look inside. Do not start slicing it into slices right away. What's going on inside is it is continuing to cook, but even more importantly, the juices are distributing themselves in a truly wonderful alignment. That's why if you cut into a steak too quickly off the barbecue, you get this sort of bullseye pattern instead of what it should be, a gentle graduation from red to various hues of pink to the outer crust. All the difference in the world between a good steak and a totally messed up steak is going on in that period of time that you're just doing nothing. Nothing. You want to find a good hot sizzling either grill or pan, get a good crust and sear on the outside. You want to finish it either in the oven or all the way on the grill. And then just let it sit. Don't wrap it in foil, don't cover it, don't poke it, don't prod it, don't even look at it. Just let it sit there, leave it alone, and you will be rewarded.","chunks":[{"text":" Do not fing touch it.","timestamp":[0,2.62]},{"text":" Here is maybe the most overlooked feature or factor in the success or failure of a steak,","timestamp":[9.46,16.9]},{"text":" particularly a thick steak, but it's true of all meat.","timestamp":[17,19.36]},{"text":" This magical period immediately following its removal from the heat,","timestamp":[19.78,24]},{"text":" it should rest","timestamp":[24,25.4]},{"text":" on the board, meaning sit there at room temperature for five to seven minutes, at which point","timestamp":[25.4,31.32]},{"text":" stay away from it.","timestamp":[31.32,33.5]},{"text":" Don't touch it.","timestamp":[33.5,34.5]},{"text":" Don't poke it.","timestamp":[34.5,35.72]},{"text":" Don't slice it to look inside.","timestamp":[35.72,37.88]},{"text":" Do not start slicing it into slices right away.","timestamp":[37.88,41.18]},{"text":" What's going on inside is it is continuing to cook, but even more importantly, the juices are distributing themselves in a truly wonderful","timestamp":[41.18,51.1]},{"text":" alignment. That's why if you cut into a steak too quickly off the barbecue, you","timestamp":[51.1,55.26]},{"text":" get this sort of bullseye pattern instead of what it should be, a gentle","timestamp":[55.26,59.84]},{"text":" graduation from red to various hues of pink to the outer crust.","timestamp":[59.84,66.16]},{"text":" All the difference in the world between a good steak and a totally messed up steak is","timestamp":[66.16,70.96]},{"text":" going on in that period of time that you're just doing nothing.","timestamp":[70.96,75.96]},{"text":" Nothing.","timestamp":[75.96,76.96]},{"text":" You want to find a good hot sizzling either grill or pan, get a good crust and sear on","timestamp":[76.96,82.86]},{"text":" the outside.","timestamp":[82.86,83.86]},{"text":" You want to finish it either in the oven or all the way on the grill.","timestamp":[83.86,87.88]},{"text":" And then just let it sit.","timestamp":[87.88,90.2]},{"text":" Don't wrap it in foil, don't cover it, don't poke it, don't prod it, don't even look at it.","timestamp":[90.2,94.98]},{"text":" Just let it sit there, leave it alone, and you will be rewarded.","timestamp":[94.98,99.5]}]}
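The chunk list in the output above pairs each text segment with a [start, end] timestamp in seconds, which maps naturally onto subtitle formats. The following is a minimal sketch of converting such a result into SRT text. The function names are hypothetical, and the handling of a missing end time is an assumption (the example output above always has both values).

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(result: dict) -> str:
    """Turn the "chunks" list of a transcription result into SRT text."""
    entries = []
    for index, chunk in enumerate(result["chunks"], start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: treat a missing end time as a zero-length cue.
            end = start
        entries.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # A blank line separates consecutive SRT cues.
    return "\n".join(entries)


# The first two chunks from the example output above.
sample = {
    "chunks": [
        {"text": " Do not fing touch it.", "timestamp": [0, 2.62]},
        {
            "text": " Here is maybe the most overlooked feature or factor"
                    " in the success or failure of a steak,",
            "timestamp": [9.46, 16.9],
        },
    ]
}

print(chunks_to_srt(sample))
```

Each cue keeps the model's own segmentation, so a chunk that ends mid-sentence stays a separate subtitle.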
Performance Metrics
- Prediction Time: 6.52s
- Total Time: 19.37s
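To put these timings in context, they can be compared with the length of the transcribed audio. The short calculation below is only a sketch: it assumes the audio is about 99.5 seconds long, taken from the end timestamp of the last chunk in the example output, which may be slightly shorter than the actual file.

```python
prediction_time_s = 6.52
total_time_s = 19.37
# Assumption: audio length approximated by the last chunk's end timestamp.
audio_duration_s = 99.5


def speed_factor(audio_s: float, elapsed_s: float) -> float:
    """Seconds of audio processed per second of elapsed time."""
    return audio_s / elapsed_s


# Time spent outside the prediction step itself.
non_prediction_s = total_time_s - prediction_time_s

print(f"{speed_factor(audio_duration_s, prediction_time_s):.1f}x")  # 15.3x
print(f"{speed_factor(audio_duration_s, total_time_s):.1f}x")       # 5.1x
print(f"{non_prediction_s:.2f}s")                                   # 12.85s
```

Under that assumption, prediction runs at roughly 15 times real time, and the whole run at roughly 5 times real time.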
All Input Parameters
{ "url": "https://www.youtube.com/watch?v=2ua_v4BA3qM", "task": "transcribe", "timestamp": "chunk", "batch_size": 64, "diarise_audio": false }
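The same input could be sent from Python with the Replicate client library. This is a hedged sketch, not the page's official snippet: it assumes the third-party replicate package is installed and that a REPLICATE_API_TOKEN environment variable is set. The helper names build_input and transcribe are hypothetical; the model reference combines the model name with the version ID listed on this page.

```python
MODEL_REF = (
    "turian/insanely-fast-whisper-with-video:"
    "4f41e90243af171da918f04da3e526b2c247065583ea9b757f2071f573965408"
)


def build_input(url: str) -> dict:
    """Build the input payload shown above for a given video URL."""
    return {
        "url": url,
        "task": "transcribe",
        "timestamp": "chunk",
        "batch_size": 64,
        "diarise_audio": False,
    }


def transcribe(url: str):
    """Run the model on Replicate and return its output."""
    # Imported here so the rest of the sketch works without the package.
    import replicate  # third-party; reads REPLICATE_API_TOKEN from the env

    return replicate.run(MODEL_REF, input=build_input(url))


if __name__ == "__main__":
    payload = build_input("https://www.youtube.com/watch?v=2ua_v4BA3qM")
    print(payload)
    # result = transcribe("https://www.youtube.com/watch?v=2ua_v4BA3qM")
```

The network call is left commented out so the sketch can be read and run without credentials.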
Input Parameters
- url
- Video URL for yt-dlp to download the audio from. Either this or audio must be provided.
- task
- Task to perform: transcribe, or translate the speech into English. (default: transcribe).
- audio
- Audio file. Either this or url must be provided.
- hf_token
- Hugging Face token (from hf.co/settings/token) that Pyannote.audio uses to diarise the audio clips. You must first agree to the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
- language
- Optional. Language spoken in the audio; specify None to perform language detection.
- timestamp
- Whisper supports both chunk-level and word-level timestamps. (default: chunk).
- batch_size
- Number of parallel batches to compute. Reduce it if you hit out-of-memory (OOM) errors. (default: 64).
- diarise_audio
- Use Pyannote.audio to diarise the audio clips. You must also provide hf_token.
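The constraints stated in the parameter list (either url or audio is required, and diarisation needs hf_token) can be checked before submitting a request. The sketch below is one possible client-side check, not part of the model itself. The function name is hypothetical, and the accepted timestamp values "chunk" and "word" are an assumption based on the description of chunk-level and word-level timestamps.

```python
from typing import List, Optional


def validate_inputs(
    url: Optional[str] = None,
    audio: Optional[str] = None,
    hf_token: Optional[str] = None,
    task: str = "transcribe",
    timestamp: str = "chunk",
    batch_size: int = 64,
    diarise_audio: bool = False,
) -> List[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if not url and not audio:
        problems.append("either url or audio must be provided")
    if task not in ("transcribe", "translate"):
        problems.append("task must be 'transcribe' or 'translate'")
    # Assumption: the two timestamp modes are spelled "chunk" and "word".
    if timestamp not in ("chunk", "word"):
        problems.append("timestamp must be 'chunk' or 'word'")
    if batch_size < 1:
        problems.append("batch_size must be at least 1")
    if diarise_audio and not hf_token:
        problems.append("diarise_audio requires hf_token")
    return problems


print(validate_inputs(url="https://www.youtube.com/watch?v=2ua_v4BA3qM"))
print(validate_inputs(diarise_audio=True))
```

Returning all problems at once, rather than raising on the first, lets a caller report every issue in a single pass.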
Output Schema
Output
Example Execution Logs
[youtube] Extracting URL: https://www.youtube.com/watch?v=2ua_v4BA3qM
[youtube] 2ua_v4BA3qM: Downloading webpage
[youtube] 2ua_v4BA3qM: Downloading ios player API JSON
[youtube] 2ua_v4BA3qM: Downloading android player API JSON
[youtube] 2ua_v4BA3qM: Downloading m3u8 information
[info] 2ua_v4BA3qM: Downloading 1 format(s): 251
[download] Destination: 1f1c25825fc2491fb6315ec79f759ff4.webm
[download] 0.1% of 1.46MiB at Unknown B/s ETA Unknown
[download] 0.2% of 1.46MiB at Unknown B/s ETA Unknown
[download] 0.5% of 1.46MiB at Unknown B/s ETA Unknown
[download] 1.0% of 1.46MiB at 10.98MiB/s ETA 00:00
[download] 2.1% of 1.46MiB at 17.64MiB/s ETA 00:00
[download] 4.2% of 1.46MiB at 29.24MiB/s ETA 00:00
[download] 8.5% of 1.46MiB at 49.04MiB/s ETA 00:00
[download] 17.0% of 1.46MiB at 82.09MiB/s ETA 00:00
[download] 34.1% of 1.46MiB at 133.52MiB/s ETA 00:00
[download] 68.3% of 1.46MiB at 207.44MiB/s ETA 00:00
[download] 100.0% of 1.46MiB at 253.97MiB/s ETA 00:00
[download] 100% of 1.46MiB in 00:00:00 at 10.46MiB/s
[ExtractAudio] Destination: 1f1c25825fc2491fb6315ec79f759ff4.mp3
Deleting original file 1f1c25825fc2491fb6315ec79f759ff4.webm (pass -k to keep)
Downloaded audio from the video URL https://www.youtube.com/watch?v=2ua_v4BA3qM
Voila!✨ Your file has been transcribed!
Removing downloaded audio file
max gpu memory allocated over runtime: 4.50 GB
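The log shows yt-dlp downloading an audio-only stream and an ExtractAudio step converting it to MP3 under a random-looking file name. A comparable download can be sketched with yt-dlp's Python API as below. This is an illustration under assumptions, not the model's actual code: the helper names are hypothetical, the UUID-based file name is a guess at how the name in the log was produced, and running download_audio requires the third-party yt_dlp package plus ffmpeg.

```python
import uuid


def audio_download_options(basename: str) -> dict:
    """yt-dlp options for fetching the best audio stream and converting to MP3."""
    return {
        "format": "bestaudio/best",
        "outtmpl": f"{basename}.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
    }


def download_audio(url: str) -> str:
    """Download the audio track of a video and return the MP3 file name."""
    # Imported here so the options helper works without the package.
    import yt_dlp  # third-party; the MP3 conversion also needs ffmpeg

    # Assumption: a random hex name, similar to the one seen in the log.
    basename = uuid.uuid4().hex
    with yt_dlp.YoutubeDL(audio_download_options(basename)) as ydl:
        ydl.download([url])
    return f"{basename}.mp3"


if __name__ == "__main__":
    print(audio_download_options("example"))
```

By default the FFmpegExtractAudio postprocessor removes the original download after conversion, which matches the "Deleting original file" line in the log.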
Version Details
- Version ID: 4f41e90243af171da918f04da3e526b2c247065583ea9b757f2071f573965408
- Version Created: January 8, 2024