sabuhigr/sabuhi-model

▶️ 25.5K runs 📅 Jun 2023 ⚙️ Cog 0.7.2
speaker-diarization speech-to-text

About

Whisper AI with channel separation and speaker diarization
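Below is a minimal sketch of invoking this model from Python with the official replicate client. It assumes the replicate package is installed, REPLICATE_API_TOKEN is set in the environment, and the Hugging Face token is read from a hypothetical HF_TOKEN variable; the version hash comes from the Version Details section at the bottom of this page, and the prediction is assumed to come back as a dict shaped like the example output below.

import os
import replicate  # pip install replicate; requires REPLICATE_API_TOKEN in the environment

output = replicate.run(
    "sabuhigr/sabuhi-model:29b6421db707f07d23fbb646c260b1e2754cbf7ccaf081435330a0c2a3544b74",
    input={
        "audio": "https://example.com/test-arabic.wav",  # placeholder URL for your audio file
        "model": "large-v2",
        "hf_token": os.environ["HF_TOKEN"],  # your Hugging Face token for speaker diarization
        "language": "ar",
        "min_speakers": "2",
        "max_speakers": "2",
        "transcription": "plain text",
    },
)

print(output["detected_language"])
for segment in output["segments"]:
    print(segment["speaker"], segment["text"])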

Example Output

Output

{"segments":[{"end":3.98,"text":" تقع القاهرة على جوانب جزر نهر النيل في شمال مصر","start":0.06,"words":[{"end":0.422,"word":"تقع","score":0.884,"start":0.06},{"end":1.025,"word":"القاهرة","score":0.899,"start":0.442,"speaker":"SPEAKER_00"},{"end":1.306,"word":"على","score":0.874,"start":1.085,"speaker":"SPEAKER_00"},{"end":1.95,"word":"جوانب","score":0.963,"start":1.367,"speaker":"SPEAKER_00"},{"end":2.372,"word":"جزر","score":0.928,"start":2.01,"speaker":"SPEAKER_00"},{"end":2.713,"word":"نهر","score":0.991,"start":2.412,"speaker":"SPEAKER_00"},{"end":3.015,"word":"النيل","score":0.527,"start":2.734,"speaker":"SPEAKER_00"},{"end":3.196,"word":"في","score":0.999,"start":3.075,"speaker":"SPEAKER_00"},{"end":3.678,"word":"شمال","score":0.986,"start":3.276,"speaker":"SPEAKER_00"},{"end":3.98,"word":"مصر","score":0.534,"start":3.759,"speaker":"SPEAKER_00"}],"speaker":"SPEAKER_00"}],"translation":null,"transcription":"[{'start': 0.06, 'end': 3.98, 'text': ' تقع القاهرة على جوانب جزر نهر النيل في شمال مصر', 'words': [{'word': 'تقع', 'start': 0.06, 'end': 0.422, 'score': 0.884}, {'word': 'القاهرة', 'start': 0.442, 'end': 1.025, 'score': 0.899, 'speaker': 'SPEAKER_00'}, {'word': 'على', 'start': 1.085, 'end': 1.306, 'score': 0.874, 'speaker': 'SPEAKER_00'}, {'word': 'جوانب', 'start': 1.367, 'end': 1.95, 'score': 0.963, 'speaker': 'SPEAKER_00'}, {'word': 'جزر', 'start': 2.01, 'end': 2.372, 'score': 0.928, 'speaker': 'SPEAKER_00'}, {'word': 'نهر', 'start': 2.412, 'end': 2.713, 'score': 0.991, 'speaker': 'SPEAKER_00'}, {'word': 'النيل', 'start': 2.734, 'end': 3.015, 'score': 0.527, 'speaker': 'SPEAKER_00'}, {'word': 'في', 'start': 3.075, 'end': 3.196, 'score': 0.999, 'speaker': 'SPEAKER_00'}, {'word': 'شمال', 'start': 3.276, 'end': 3.678, 'score': 0.986, 'speaker': 'SPEAKER_00'}, {'word': 'مصر', 'start': 3.759, 'end': 3.98, 'score': 0.534, 'speaker': 'SPEAKER_00'}], 'speaker': 'SPEAKER_00'}]","detected_language":"arabic"}

Performance Metrics

56.17s Prediction Time
231.68s Total Time
All Input Parameters
{
  "audio": "https://replicate.delivery/pbxt/J2ZTuLIUqfnWLDBJ8LYX7SmtnSQzdx0RAxPXzh3aKhKMPQhx/test-arabic.wav",
  "model": "large-v2",
  "hf_token": "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",
  "language": "ar",
  "max_speakers": "2",
  "min_speakers": "2",
  "transcription": "plain text",
  "suppress_tokens": "-1",
  "logprob_threshold": -1,
  "no_speech_threshold": 0.6,
  "condition_on_previous_text": true,
  "compression_ratio_threshold": 2.4,
  "temperature_increment_on_fallback": 0.2
}
Input Parameters
audio (required) Type: string
Audio file
model Default: large-v2
Choose a Whisper model.
hf_token (required) Type: string
Your Hugging Face token for speaker diarization
language (required)
Language spoken in the audio; specify None to perform language detection.
patience Type: number
Optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424; the default (1.0) is equivalent to conventional beam search.
translate Type: boolean, Default: false
Translate the text to English when set to True.
temperature Type: number, Default: 0
Temperature to use for sampling.
max_speakers Default: 1
Maximum number of speakers. Select 2 if the recording is stereo, 1 if it is mono. The default is 1 for mono recordings.
min_speakers Default: 1
Minimum number of speakers. Select 2 if the recording is stereo, 1 if it is mono. The default is 1 for mono recordings.
transcription Default: plain text
Choose the format for the transcription.
initial_prompt Type: string
Optional text to provide as a prompt for the first window.
suppress_tokens Type: string, Default: -1
Comma-separated list of token IDs to suppress during sampling; '-1' will suppress most special characters except common punctuation.
logprob_threshold Type: number, Default: -1
If the average log probability is lower than this value, treat the decoding as failed.
no_speech_threshold Type: number, Default: 0.6
If the probability of the <|nospeech|> token is higher than this value AND the decoding has failed due to `logprob_threshold`, consider the segment as silence.
condition_on_previous_text Type: boolean, Default: true
If True, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop.
compression_ratio_threshold Type: number, Default: 2.4
If the gzip compression ratio is higher than this value, treat the decoding as failed.
temperature_increment_on_fallback Type: number, Default: 0.2
Temperature to increase by when decoding fails to meet either of the thresholds above (`logprob_threshold` or `compression_ratio_threshold`); see the sketch after this list.
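To make the interplay of the decoding parameters concrete, here is a minimal sketch of the Whisper-style temperature fallback they control. decode_window is a hypothetical stand-in for the model's internal decoder and is assumed to return a dict with "compression_ratio" and "avg_logprob" for the decoded window; this is an illustration of the retry logic, not this model's actual implementation.

def transcribe_window(decode_window,
                      temperature=0.0,
                      temperature_increment_on_fallback=0.2,
                      compression_ratio_threshold=2.4,
                      logprob_threshold=-1.0):
    # Build the ladder of temperatures to try, e.g. 0.0, 0.2, ..., 1.0.
    temperatures = [temperature]
    if temperature_increment_on_fallback is not None:
        t = temperature + temperature_increment_on_fallback
        while t <= 1.0 + 1e-9:
            temperatures.append(round(t, 2))
            t += temperature_increment_on_fallback

    result = None
    for t in temperatures:
        result = decode_window(temperature=t)
        too_repetitive = result["compression_ratio"] > compression_ratio_threshold
        too_improbable = result["avg_logprob"] < logprob_threshold
        if not (too_repetitive or too_improbable):
            break  # decoding looks sane; stop retrying at higher temperatures
    return result

If every temperature fails and the no-speech probability exceeds `no_speech_threshold`, the segment is treated as silence instead, as described above.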
Output Schema

Output
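The schema itself is not rendered on this page. As an assumption inferred purely from the example output above (not an authoritative definition), the result appears to have roughly the following shape, sketched here as Python typed dicts:

from typing import List, Optional, TypedDict

class Word(TypedDict, total=False):
    word: str
    start: float
    end: float
    score: float
    speaker: str          # may be absent for some words

class Segment(TypedDict):
    start: float
    end: float
    text: str
    words: List[Word]
    speaker: str

class Output(TypedDict):
    segments: List[Segment]
    translation: Optional[str]
    transcription: str    # stringified transcription in the requested format
    detected_language: str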

Example Execution Logs
Transcribe with large-v2 model
  0%|          | 0/439 [00:00<?, ?frames/s]
100%|██████████| 439/439 [00:16<00:00, 25.94frames/s]
Downloading (…)rocessor_config.json:   0%|          | 0.00/158 [00:00<?, ?B/s]
Downloading (…)rocessor_config.json: 100%|██████████| 158/158 [00:00<00:00, 49.9kB/s]
Downloading (…)lve/main/config.json: 0.00B [00:00, ?B/s]
Downloading (…)lve/main/config.json: 1.56kB [00:00, 2.80MB/s]
Downloading (…)olve/main/vocab.json:   0%|          | 0.00/507 [00:00<?, ?B/s]
Downloading (…)olve/main/vocab.json: 100%|██████████| 507/507 [00:00<00:00, 268kB/s]
Downloading (…)cial_tokens_map.json:   0%|          | 0.00/85.0 [00:00<?, ?B/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 85.0/85.0 [00:00<00:00, 47.2kB/s]
Downloading pytorch_model.bin:   0%|          | 0.00/1.26G [00:00<?, ?B/s]
Downloading pytorch_model.bin: 100%|██████████| 1.26G/1.26G [00:25<00:00, 49.8MB/s]
Downloading (…)olve/2.1/config.yaml:   0%|          | 0.00/500 [00:00<?, ?B/s]
Downloading (…)olve/2.1/config.yaml: 100%|██████████| 500/500 [00:00<00:00, 67.2kB/s]
Downloading pytorch_model.bin:   0%|          | 0.00/17.7M [00:00<?, ?B/s]
Downloading pytorch_model.bin:  59%|█████▉    | 10.5M/17.7M [00:00<00:00, 39.2MB/s]
Downloading pytorch_model.bin: 100%|██████████| 17.7M/17.7M [00:00<00:00, 56.2MB/s]
Downloading (…)/2022.07/config.yaml:   0%|          | 0.00/318 [00:00<?, ?B/s]
Downloading (…)/2022.07/config.yaml: 100%|██████████| 318/318 [00:00<00:00, 164kB/s]
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.3. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file ../root/.cache/torch/pyannote/models--pyannote--segmentation/snapshots/c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b/pytorch_model.bin`
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.0+cu117. Bad things might happen unless you revert torch to 1.x.
Downloading (…)ain/hyperparams.yaml: 0.00B [00:00, ?B/s]
Downloading (…)ain/hyperparams.yaml: 1.92kB [00:00, 3.93MB/s]
Downloading embedding_model.ckpt:   0%|          | 0.00/83.3M [00:00<?, ?B/s]
Downloading embedding_model.ckpt:  38%|███▊      | 31.5M/83.3M [00:00<00:00, 226MB/s]
Downloading embedding_model.ckpt:  76%|███████▌  | 62.9M/83.3M [00:00<00:00, 215MB/s]
Downloading embedding_model.ckpt: 100%|██████████| 83.3M/83.3M [00:00<00:00, 209MB/s]
Downloading (…)an_var_norm_emb.ckpt:   0%|          | 0.00/1.92k [00:00<?, ?B/s]
Downloading (…)an_var_norm_emb.ckpt: 100%|██████████| 1.92k/1.92k [00:00<00:00, 1.03MB/s]
Downloading classifier.ckpt:   0%|          | 0.00/5.53M [00:00<?, ?B/s]
Downloading classifier.ckpt: 100%|██████████| 5.53M/5.53M [00:00<00:00, 179MB/s]
Downloading (…)in/label_encoder.txt: 0.00B [00:00, ?B/s]
Downloading (…)in/label_encoder.txt: 129kB [00:00, 66.6MB/s]
0  1  ... intersection   union
0  [ 00:00:00.497 -->  00:00:04.345]  A  ...        0.221  3.8475
[1 rows x 7 columns]
[{'start': 0.06, 'end': 3.98, 'text': ' تقع القاهرة على جوانب جزر نهر النيل في شمال مصر', 'words': [{'word': 'تقع', 'start': 0.06, 'end': 0.422, 'score': 0.884}, {'word': 'القاهرة', 'start': 0.442, 'end': 1.025, 'score': 0.899, 'speaker': 'SPEAKER_00'}, {'word': 'على', 'start': 1.085, 'end': 1.306, 'score': 0.874, 'speaker': 'SPEAKER_00'}, {'word': 'جوانب', 'start': 1.367, 'end': 1.95, 'score': 0.963, 'speaker': 'SPEAKER_00'}, {'word': 'جزر', 'start': 2.01, 'end': 2.372, 'score': 0.928, 'speaker': 'SPEAKER_00'}, {'word': 'نهر', 'start': 2.412, 'end': 2.713, 'score': 0.991, 'speaker': 'SPEAKER_00'}, {'word': 'النيل', 'start': 2.734, 'end': 3.015, 'score': 0.527, 'speaker': 'SPEAKER_00'}, {'word': 'في', 'start': 3.075, 'end': 3.196, 'score': 0.999, 'speaker': 'SPEAKER_00'}, {'word': 'شمال', 'start': 3.276, 'end': 3.678, 'score': 0.986, 'speaker': 'SPEAKER_00'}, {'word': 'مصر', 'start': 3.759, 'end': 3.98, 'score': 0.534, 'speaker': 'SPEAKER_00'}], 'speaker': 'SPEAKER_00'}]
Version Details
Version ID
29b6421db707f07d23fbb646c260b1e2754cbf7ccaf081435330a0c2a3544b74
Version Created
August 3, 2023