jigsawstack/speech-to-text 📝✓🔢 → 📝

▶️ 5 runs 📅 Jun 2025 ⚙️ Cog 0.15.8
speaker-diarization speech-to-text video-to-text

About

Convert speech to text with JigsawStack's API.

Example Output

{
"success": true,
"text": " The little tales they tell are false The door was barred, locked and bolted as well Ripe pears are fit for a queen's table A big wet stain was on the round carpet The kite dipped and swayed but stayed aloft The pleasant hours fly by much too soon The room was crowded with a mild wob The room was crowded with a wild mob This strong arm shall shield your honour She blushed when he gave her a white orchid The beetle droned in the hot June sun",
"chunks": [
{
"timestamp": [
0,
2.39
],
"text": " The little tales"
},
{
"timestamp": [
2.39,
4.78
],
"text": "they tell are false"
},
{
"timestamp": [
4.78,
7.130000000000001
],
"text": " The door was barred,"
},
{
"timestamp": [
7.130000000000001,
9.48
],
"text": "locked and bolted as well"
},
{
"timestamp": [
9.48,
11.27
],
"text": " Ripe pears are fit"
},
{
"timestamp": [
11.27,
13.06
],
"text": "for a queen's table"
},
{
"timestamp": [
13.06,
15.149999999999999
],
"text": " A big wet stain"
},
{
"timestamp": [
15.149999999999999,
17.24
],
"text": "was on the round carpet"
},
{
"timestamp": [
17.24,
19.509999999999998
],
"text": " The kite dipped and"
},
{
"timestamp": [
19.509999999999998,
21.78
],
"text": "swayed but stayed aloft"
},
{
"timestamp": [
21.78,
24.04
],
"text": " The pleasant hours fly"
},
{
"timestamp": [
24.04,
26.3
],
"text": "by much too soon"
},
{
"timestamp": [
26.3,
28.53
],
"text": " The room was crowded"
},
{
"timestamp": [
28.53,
30.76
],
"text": "with a mild wob"
},
{
"timestamp": [
30.76,
32.92
],
"text": " The room was crowded"
},
{
"timestamp": [
32.92,
35.08
],
"text": "with a wild mob"
},
{
"timestamp": [
35.08,
37.16
],
"text": " This strong arm"
},
{
"timestamp": [
37.16,
39.24
],
"text": "shall shield your honour"
},
{
"timestamp": [
39.24,
41.59
],
"text": " She blushed when he"
},
{
"timestamp": [
41.59,
43.94
],
"text": "gave her a white orchid"
},
{
"timestamp": [
43.94,
46.22
],
"text": " The beetle droned in"
},
{
"timestamp": [
46.22,
48.5
],
"text": "the hot June sun"
}
],
"_usage": {
"input_tokens": 15,
"output_tokens": 227,
"inference_time_tokens": 437,
"total_tokens": 679
}
}
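The `chunks` array above pairs each text fragment with a `[start, end]` timestamp in seconds, which maps naturally onto subtitle formats. The following is a minimal sketch, assuming the output has already been parsed into Python objects shaped like the example; the helper names `format_srt_time` and `chunks_to_srt` are hypothetical and not part of the model or the JigsawStack API.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    # Rounding to whole milliseconds also absorbs float noise
    # such as 7.130000000000001 in the example output.
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: list[dict]) -> str:
    """Build SRT subtitle text from chunks shaped like the example output."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()  # some chunks carry a leading space
        blocks.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"


# The first three chunks copied from the example output.
sample_chunks = [
    {"timestamp": [0, 2.39], "text": " The little tales"},
    {"timestamp": [2.39, 4.78], "text": "they tell are false"},
    {"timestamp": [4.78, 7.130000000000001], "text": " The door was barred,"},
]

print(chunks_to_srt(sample_chunks))
```

Rounding to milliseconds before splitting into hours, minutes, and seconds keeps the timestamps stable even when the raw values carry floating-point residue.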

Performance Metrics

Prediction Time: 1.02s
Total Time: 10.91s

All Input Parameters
{
  "url": "https://jigsawstack.com/preview/stt-example.wav",
  "api_key": "<YOUR_JIGSAWSTACK_API_KEY>",
  "translate": false,
  "batch_size": 30,
  "by_speaker": false
}
Input Parameters
url Type: string
The video/audio URL. Not required if file_store_key is specified.
api_key Type: string
🔐 Your JigsawStack API Key (required)
language Type: string
The language to transcribe or translate the file into. If not specified, the model will automatically detect the language.
translate Type: boolean, Default: false
When set to true, translates the content into English (or into the specified language if the language parameter is provided).
batch_size Type: integer, Default: 30, Range: 1 - 40
The batch size to return. Maximum value is 40. This controls how the audio is chunked for processing.
by_speaker Type: boolean, Default: false
Identifies and separates different speakers in the audio file.
webhook_url Type: string
Webhook URL to send result to. When provided, the API will process asynchronously.
file_store_key Type: string
The key used to store the video/audio file on JigsawStack File Storage. Not required if url is specified.
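The parameter list above can be turned into a small input builder. This is a sketch under stated assumptions rather than an official client: `build_input` and `run_transcription` are hypothetical helper names, the checks only mirror the constraints documented above (an API key, one of `url` or `file_store_key`, and `batch_size` between 1 and 40), and the call itself assumes the third-party `replicate` Python package is installed with `REPLICATE_API_TOKEN` set in the environment.

```python
from typing import Optional

# Version ID as listed under "Version Details" on this page.
MODEL_REF = (
    "jigsawstack/speech-to-text:"
    "b5c3816372374b87279128dfc7cf4d3ea7d542f41dfe2081d1a89f88ba8dd509"
)


def build_input(
    api_key: str,
    url: Optional[str] = None,
    file_store_key: Optional[str] = None,
    language: Optional[str] = None,
    translate: bool = False,
    batch_size: int = 30,
    by_speaker: bool = False,
    webhook_url: Optional[str] = None,
) -> dict:
    """Assemble the model input, enforcing the documented constraints."""
    if not api_key:
        raise ValueError("api_key is required")
    if not url and not file_store_key:
        raise ValueError("provide either url or file_store_key")
    if not 1 <= batch_size <= 40:
        raise ValueError("batch_size must be between 1 and 40")

    payload = {
        "api_key": api_key,
        "translate": translate,
        "batch_size": batch_size,
        "by_speaker": by_speaker,
    }
    optional = {
        "url": url,
        "file_store_key": file_store_key,
        "language": language,
        "webhook_url": webhook_url,
    }
    # Only send optional fields that were actually provided.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def run_transcription(payload: dict):
    """Run the model on Replicate (assumes the `replicate` package)."""
    import replicate  # imported lazily so build_input works without it

    return replicate.run(MODEL_REF, input=payload)


example = build_input(
    api_key="<YOUR_JIGSAWSTACK_API_KEY>",  # placeholder, never commit a real key
    url="https://jigsawstack.com/preview/stt-example.wav",
)
print(sorted(example))
```

Validating locally before calling `run_transcription(example)` avoids spending a prediction on an input the model would reject, and keeping the key out of source (for instance by reading it from an environment variable) avoids leaking it in shared examples.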
Output Schema

Output

Type: string
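The schema declares the output as a string, while the example output above is a JSON object. One defensive way to consume the result is to accept either form. The sketch below assumes that a string result, when it is JSON, decodes to an object shaped like the example; `parse_output` is a hypothetical helper, and the fallback for non-JSON strings is an illustrative choice, not documented behavior.

```python
import json


def parse_output(output):
    """Normalize a prediction result into a dict with 'text' and 'chunks'."""
    if isinstance(output, str):
        try:
            output = json.loads(output)
        except json.JSONDecodeError:
            # Assumption: a non-JSON string is treated as the bare transcript.
            return {"text": output, "chunks": []}
    if not isinstance(output, dict):
        raise TypeError(f"unexpected output type: {type(output).__name__}")
    if output.get("success") is False:
        raise RuntimeError("transcription reported success=false")
    return output


raw = '{"success": true, "text": " The little tales they tell are false", "chunks": []}'
result = parse_output(raw)
print(result["text"].strip())
```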

Version Details
Version ID
b5c3816372374b87279128dfc7cf4d3ea7d542f41dfe2081d1a89f88ba8dd509
Version Created
June 30, 2025