cjwbw/whisper

▶️ 54.9K runs 📅 Dec 2022 ⚙️ Cog 0.6.1 🔗 GitHub
speech-to-text subtitle-generation

About

OpenAI Whisper, with the large-v2 checkpoint.

Example Output

{"segments":[{"id":0,"end":5.44,"seek":0,"text":" Le centre culturel a organisé une table ronde.","start":0,"tokens":[1456,10093,3713,75,257,15223,526,2251,3199,367,7259,13],"avg_logprob":-0.38572823511411064,"temperature":0,"no_speech_prob":0.2616336941719055,"compression_ratio":1.3846153846153846},{"id":1,"end":10.24,"seek":0,"text":" Son univers est celui de sa famille.","start":5.44,"tokens":[5185,5950,871,22829,368,601,28123,13],"avg_logprob":-0.38572823511411064,"temperature":0,"no_speech_prob":0.2616336941719055,"compression_ratio":1.3846153846153846},{"id":2,"end":14.76,"seek":0,"text":" Le bateau lutte pour tenir face au vent.","start":10.24,"tokens":[1456,37936,1459,38319,975,2016,30593,1851,1609,6931,13],"avg_logprob":-0.38572823511411064,"temperature":0,"no_speech_prob":0.2616336941719055,"compression_ratio":1.3846153846153846},{"id":3,"end":20.6,"seek":0,"text":" Aucune demande ne sera prise en compte après cette date.","start":14.76,"tokens":[316,1311,2613,26982,408,15021,49468,465,19424,13274,5550,4002,13],"avg_logprob":-0.38572823511411064,"temperature":0,"no_speech_prob":0.2616336941719055,"compression_ratio":1.3846153846153846},{"id":4,"end":26.400000000000002,"seek":0,"text":" Rolon et Olivier sont d'une merveilleuse brahoure.","start":20.6,"tokens":[497,401,266,1030,48075,4900,274,6,2613,3551,303,3409,438,1548,71,44283,13],"avg_logprob":-0.38572823511411064,"temperature":0,"no_speech_prob":0.2616336941719055,"compression_ratio":1.3846153846153846},{"id":5,"end":31.4,"seek":2640,"text":" Connecis-vous cette expression?","start":26.4,"tokens":[2656,716,66,271,12,16514,5550,6114,2506],"avg_logprob":-0.4892590840657552,"temperature":0,"no_speech_prob":0.00015065229672472924,"compression_ratio":1.2810457516339868},{"id":6,"end":36.4,"seek":2640,"text":" Les acteurs se préparent à monter en 
scène.","start":31.4,"tokens":[6965,605,20395,369,11127,38321,1531,47945,465,42424,13],"avg_logprob":-0.4892590840657552,"temperature":0,"no_speech_prob":0.00015065229672472924,"compression_ratio":1.2810457516339868},{"id":7,"end":40.4,"seek":2640,"text":" Le pays et jeolier vio de la colline.","start":36.4,"tokens":[1456,10604,1030,1506,401,811,371,1004,368,635,1263,533,13],"avg_logprob":-0.4892590840657552,"temperature":0,"no_speech_prob":0.00015065229672472924,"compression_ratio":1.2810457516339868},{"id":8,"end":45.4,"seek":2640,"text":" Tu vas voir à mon vieux ce que je vais faire.","start":40.4,"tokens":[7836,11481,10695,1531,1108,4941,2449,1769,631,1506,9369,4865,13],"avg_logprob":-0.4892590840657552,"temperature":0,"no_speech_prob":0.00015065229672472924,"compression_ratio":1.2810457516339868},{"id":9,"end":51.4,"seek":2640,"text":" De fait, Jacques buvez beaucoup.","start":45.4,"tokens":[1346,3887,11,42691,758,10941,8796,13],"avg_logprob":-0.4892590840657552,"temperature":0,"no_speech_prob":0.00015065229672472924,"compression_ratio":1.2810457516339868},{"id":10,"end":57.4,"seek":5140,"text":" Tu n'y iras certaines mots pas.","start":51.4,"tokens":[7836,297,6,88,3418,296,36993,34009,1736,13],"avg_logprob":-0.3673367636544364,"temperature":0,"no_speech_prob":0.0005638044676743448,"compression_ratio":1.2875816993464053},{"id":11,"end":62.4,"seek":5140,"text":" Émanuelle s'ombe encore à voir vieille.","start":57.4,"tokens":[4922,1601,23635,262,6,298,650,10122,1531,10695,4941,3409,13],"avg_logprob":-0.3673367636544364,"temperature":0,"no_speech_prob":0.0005638044676743448,"compression_ratio":1.2875816993464053},{"id":12,"end":67.4,"seek":5140,"text":" Il a signé un contrat de 5 ans.","start":62.4,"tokens":[4416,257,1465,526,517,40944,368,1025,1567,13],"avg_logprob":-0.3673367636544364,"temperature":0,"no_speech_prob":0.0005638044676743448,"compression_ratio":1.2875816993464053},{"id":13,"end":72.4,"seek":5140,"text":" Des vegonnets de mine sont transformés en 
un brevoir.","start":67.4,"tokens":[3885,1241,10660,77,1385,368,3892,4900,4088,2191,465,517,1403,8823,13],"avg_logprob":-0.3673367636544364,"temperature":0,"no_speech_prob":0.0005638044676743448,"compression_ratio":1.2875816993464053},{"id":14,"end":77.4,"seek":5140,"text":" Il y a aller de leur crédibilité.","start":72.4,"tokens":[4416,288,257,8722,368,9580,37368,11607,5066,13],"avg_logprob":-0.3673367636544364,"temperature":0,"no_speech_prob":0.0005638044676743448,"compression_ratio":1.2875816993464053},{"id":15,"end":81.4,"seek":7740,"text":" Tous les yeux étaient braqués sur lui.","start":77.4,"tokens":[47277,1512,36163,25999,1548,358,2191,1022,8783,13],"avg_logprob":-0.16321171247042143,"temperature":0,"no_speech_prob":0.00009526827489025891,"compression_ratio":1.1226415094339623},{"id":16,"end":86.4,"seek":7740,"text":" Cette fois, la chance peut nous surrir.","start":81.4,"tokens":[25556,9576,11,635,2931,5977,4666,1022,10949,13],"avg_logprob":-0.16321171247042143,"temperature":0,"no_speech_prob":0.00009526827489025891,"compression_ratio":1.1226415094339623},{"id":17,"end":115.4,"seek":8640,"text":" Le bateau filait rapide et silencieux.","start":86.4,"tokens":[50364,1456,37936,1459,1387,1001,5099,482,1030,3425,36220,2449,13,51814],"avg_logprob":-0.11800411542256674,"temperature":0,"no_speech_prob":0.000017061091057257727,"compression_ratio":0.8260869565217391}],"translation":null,"transcription":" Le centre culturel a organisé une table ronde. Son univers est celui de sa famille. Le bateau lutte pour tenir face au vent. Aucune demande ne sera prise en compte après cette date. Rolon et Olivier sont d'une merveilleuse brahoure. Connecis-vous cette expression? Les acteurs se préparent à monter en scène. Le pays et jeolier vio de la colline. Tu vas voir à mon vieux ce que je vais faire. De fait, Jacques buvez beaucoup. Tu n'y iras certaines mots pas. Émanuelle s'ombe encore à voir vieille. Il a signé un contrat de 5 ans. 
Des vegonnets de mine sont transformés en un brevoir. Il y a aller de leur crédibilité. Tous les yeux étaient braqués sur lui. Cette fois, la chance peut nous surrir. Le bateau filait rapide et silencieux.","detected_language":"french"}
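Since the model is tagged for subtitle generation, the `segments` array above can be turned into subtitles using only its `start`, `end`, and `text` fields. The following is a minimal sketch, assuming SRT as the target format; the helper names `srt_timestamp` and `segments_to_srt` are illustrative and not part of the model.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Build SRT text from segment dicts that carry 'start', 'end' and 'text'."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# The first two segments of the example output, reduced to the fields used here.
example = [
    {"start": 0, "end": 5.44, "text": " Le centre culturel a organisé une table ronde."},
    {"start": 5.44, "end": 10.24, "text": " Son univers est celui de sa famille."},
]
print(segments_to_srt(example))
```

The leading space that Whisper puts in front of each segment's text is stripped so that cues start cleanly.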

Performance Metrics

3.67s Prediction Time
131.72s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/HyA5vc7VTgMhDXKZJdHJOYsFZThUI0bxCGlTIJKm6TP871il/OSR_fr_000_0045_8k.wav",
  "model": "base",
  "transcription": "plain text",
  "suppress_tokens": "-1",
  "logprob_threshold": -1,
  "no_speech_threshold": 0.6,
  "condition_on_previous_text": true,
  "compression_ratio_threshold": 2.4,
  "temperature_increment_on_fallback": 0.2
}
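A request with these inputs could be sent from Python roughly as follows. This is a sketch, assuming the third-party `replicate` client is installed and a `REPLICATE_API_TOKEN` is set in the environment; `build_input` and `transcribe` are hypothetical helper names, and the version hash is the one listed under Version Details on this page.

```python
# Model reference in "owner/name:version" form, taken from this page.
MODEL_VERSION = (
    "cjwbw/whisper:"
    "b70a8e9dc4aa40bf4309285fbaefe3ed3d3a313f1f32ea61826fc64cdb4917a5"
)


def build_input(audio_url: str, **overrides) -> dict:
    """Return the input payload shown above, with optional per-call overrides."""
    payload = {
        "audio": audio_url,
        "model": "base",
        "transcription": "plain text",
        "suppress_tokens": "-1",
        "logprob_threshold": -1,
        "no_speech_threshold": 0.6,
        "condition_on_previous_text": True,
        "compression_ratio_threshold": 2.4,
        "temperature_increment_on_fallback": 0.2,
    }
    payload.update(overrides)
    return payload


def transcribe(audio_url: str, **overrides):
    """Run the model on Replicate and return its output."""
    # Imported lazily so the payload helper works without the client installed.
    import replicate

    return replicate.run(MODEL_VERSION, input=build_input(audio_url, **overrides))
```

For example, `transcribe(audio_url, translate=True)` would request an English translation instead of a transcription in the detected language.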
Input Parameters
audio (required) Type: string
Audio file to transcribe.
model Default: base
Choose a Whisper model.
language
Language spoken in the audio; specify None to perform language detection.
patience Type: number
Optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424; the default (1.0) is equivalent to conventional beam search.
translate Type: boolean, Default: false
Translate the text to English when set to True.
temperature Type: number, Default: 0
Temperature to use for sampling.
transcription Default: plain text
Choose the format for the transcription.
initial_prompt Type: string
Optional text to provide as a prompt for the first window.
suppress_tokens Type: string, Default: -1
Comma-separated list of token IDs to suppress during sampling; '-1' suppresses most special characters except common punctuation.
logprob_threshold Type: number, Default: -1
If the average log probability is lower than this value, treat the decoding as failed.
no_speech_threshold Type: number, Default: 0.6
If the probability of the <|nospeech|> token is higher than this value AND the decoding has failed due to `logprob_threshold`, consider the segment as silence.
condition_on_previous_text Type: boolean, Default: true
If True, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop.
compression_ratio_threshold Type: number, Default: 2.4
If the gzip compression ratio is higher than this value, treat the decoding as failed.
temperature_increment_on_fallback Type: number, Default: 0.2
Amount by which to raise the temperature on fallback, when the decoding fails to meet either `compression_ratio_threshold` or `logprob_threshold`.
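The way the threshold parameters interact can be sketched in plain Python. This is an illustration of the rules as described in the parameter list, not the model's actual code; the function names are hypothetical, and the assumption that the temperature is raised in steps up to 1.0 follows the behaviour of the reference Whisper command-line tool.

```python
from typing import List, Optional


def temperature_schedule(temperature: float = 0.0,
                         increment: Optional[float] = 0.2) -> List[float]:
    """Temperatures to try in order: start value, then steps of `increment` up to 1.0."""
    if increment is None or increment <= 0:
        return [temperature]
    temps = []
    t = temperature
    while t <= 1.0 + 1e-6:
        temps.append(round(t, 6))
        t += increment
    return temps


def needs_fallback(avg_logprob: float, compression_ratio: float,
                   logprob_threshold: float = -1.0,
                   compression_ratio_threshold: float = 2.4) -> bool:
    """Decoding counts as failed if the text is too repetitive or too uncertain."""
    too_repetitive = compression_ratio > compression_ratio_threshold
    too_uncertain = avg_logprob < logprob_threshold
    return too_repetitive or too_uncertain


def is_silence(no_speech_prob: float, avg_logprob: float,
               no_speech_threshold: float = 0.6,
               logprob_threshold: float = -1.0) -> bool:
    """A segment is treated as silence only if both conditions hold."""
    return no_speech_prob > no_speech_threshold and avg_logprob < logprob_threshold


# Values from segment 0 of the example output.
print(needs_fallback(avg_logprob=-0.38572823511411064,
                     compression_ratio=1.3846153846153846))
print(is_silence(no_speech_prob=0.2616336941719055,
                 avg_logprob=-0.38572823511411064))
```

With the defaults, segment 0 of the example passes both checks, which is consistent with its reported `temperature` of 0: no fallback to a higher temperature was needed.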
Example Execution Logs
Transcribe with base model
Version Details
Version ID
b70a8e9dc4aa40bf4309285fbaefe3ed3d3a313f1f32ea61826fc64cdb4917a5
Version Created
December 16, 2022