cjwbw/voicecraft

▶️ 10.6K runs 📅 Apr 2024 ⚙️ Cog 0.13.7 🔗 GitHub 📄 Paper ⚖️ License
audio-to-audio text-to-speech voice-cloning

About

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Example Output


{"generated_audio":"https://replicate.delivery/pbxt/YvDHf5LaFg2FKCZieFteAxnHeKYSdfMRjbeeIBFQYgfjlM0sSA/out.wav","whisper_transcript_orig_audio":"But when I had approached so near to them, the common object, which the sense deceives, lost not by distance any of its marks."}
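The example output is a JSON object with two fields, `generated_audio` and `whisper_transcript_orig_audio`. A minimal Python sketch of reading it, assuming the response is already available as a string. The helper name `parse_prediction_output` and the placeholder URL are hypothetical and not part of any Replicate client:

```python
import json


def parse_prediction_output(raw: str) -> tuple[str, str]:
    """Return (audio_url, transcript) from the model's JSON output."""
    data = json.loads(raw)
    return data["generated_audio"], data["whisper_transcript_orig_audio"]


# Placeholder payload with the same two keys as the example output.
raw = json.dumps(
    {
        "generated_audio": "https://example.invalid/out.wav",
        "whisper_transcript_orig_audio": "But when I had approached so near to them.",
    }
)

audio_url, transcript = parse_prediction_output(raw)
print(audio_url)
print(transcript)
```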

Performance Metrics

7.26s Prediction Time
7.30s Total Time
All Input Parameters
{
  "task": "zero-shot text-to-speech",
  "top_p": 0.8,
  "kvcache": 1,
  "orig_audio": "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
  "cut_off_sec": 3.01,
  "left_margin": 0.08,
  "temperature": 1,
  "right_margin": 0.08,
  "whisperx_model": "base.en",
  "orig_transcript": "",
  "stop_repetition": 3,
  "voicecraft_model": "giga330M_TTSEnhanced.pth",
  "sample_batch_size": 4,
  "target_transcript": "I cannot believe that the same model can also do text to speech synthesis too!"
}
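The inputs above could be submitted programmatically. The following is a sketch only, assuming Replicate's HTTP predictions endpoint (`https://api.replicate.com/v1/predictions`), a bearer token stored in a `REPLICATE_API_TOKEN` environment variable, and the version ID listed under Version Details. The helper `build_prediction_request` is a hypothetical name. The request is built but deliberately not sent:

```python
import json
import os
import urllib.request

# Version ID as listed under "Version Details" on this page.
VERSION_ID = "db97f6312d4c4d20e500e47fd95d8f14b00d8d28e046834faffb7999d83b6b30"


def build_prediction_request(inputs: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a prediction."""
    body = json.dumps({"version": VERSION_ID, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        "https://api.replicate.com/v1/predictions",  # assumed endpoint
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same values as the example inputs shown above.
inputs = {
    "task": "zero-shot text-to-speech",
    "top_p": 0.8,
    "kvcache": 1,
    "orig_audio": "https://replicate.delivery/pbxt/Kh3PJuzs2xNgaaNOU6fD3jTz0Xx2dE1zpdXpT2k19fzsB8qE/84_121550_000074_000000.wav",
    "cut_off_sec": 3.01,
    "left_margin": 0.08,
    "temperature": 1,
    "right_margin": 0.08,
    "whisperx_model": "base.en",
    "orig_transcript": "",
    "stop_repetition": 3,
    "voicecraft_model": "giga330M_TTSEnhanced.pth",
    "sample_batch_size": 4,
    "target_transcript": "I cannot believe that the same model can also do text to speech synthesis too!",
}

token = os.environ.get("REPLICATE_API_TOKEN", "<your-token>")
request = build_prediction_request(inputs, token)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would create a real, billable prediction, so that step is left to the reader.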
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed
task Default: zero-shot text-to-speech
Choose a task
top_p Type: number Default: 0.9
Default value for TTS is 0.9, and 0.8 for speech editing
kvcache Default: 1
Set to 0 to use less VRAM, but with slower inference
orig_audio (required) Type: string
Original audio file
cut_off_sec Type: number Default: 3.01
Only used for the zero-shot text-to-speech task. The number of seconds from the start of the original audio to use as the voice reference. 3 seconds of reference is generally enough for high-quality voice cloning, but longer is generally better; try, for example, 3 to 6 seconds
left_margin Type: number Default: 0.08
Margin to the left of the editing segment
temperature Type: number Default: 1
Adjusts the randomness of outputs: values greater than 1 are more random, and 0 is deterministic. Changing this value is not recommended
right_margin Type: number Default: 0.08
Margin to the right of the editing segment
whisperx_model Default: base.en
If orig_transcript is not provided, choose a WhisperX model for generating the transcript. Inaccurate transcription may cause errors in TTS or speech editing. You can modify the generated transcript and provide it directly as orig_transcript
orig_transcript Type: string Default: (empty)
Optionally provide the transcript of the input audio. Leave it blank to have the selected whisperx_model generate the transcript. Inaccurate transcription may cause errors in TTS or speech editing
stop_repetition Type: integer Default: 3
Default value for TTS is 3, and -1 for speech editing. -1 means the probability of silence tokens is not adjusted. If there are long silences or unnaturally stretched words, increase sample_batch_size to 2, 3, or even 4
voicecraft_model Default: giga330M_TTSEnhanced.pth
Choose a model
sample_batch_size Type: integer Default: 4
Default value for TTS is 4, and 1 for speech editing. Under the hood, the model generates this many samples and chooses the shortest one, so the higher the number, the faster the speech in the output tends to be
target_transcript (required) Type: string
Transcript of the target audio file
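Several parameters above have different recommended defaults for TTS and for speech editing. A small sketch that collects those values in one place, assuming the numbers stated in the parameter descriptions. The labels `"tts"` and `"editing"`, the `SamplingDefaults` class, and the `sampling_inputs` helper are hypothetical; they are not values of the `task` parameter or part of the model's API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingDefaults:
    top_p: float
    stop_repetition: int
    sample_batch_size: int


# Values taken from the parameter descriptions; the dictionary keys are
# illustrative labels only, not valid `task` strings.
DEFAULTS = {
    "tts": SamplingDefaults(top_p=0.9, stop_repetition=3, sample_batch_size=4),
    "editing": SamplingDefaults(top_p=0.8, stop_repetition=-1, sample_batch_size=1),
}


def sampling_inputs(kind: str, **overrides) -> dict:
    """Return the documented defaults for `kind`, with optional overrides."""
    d = DEFAULTS[kind]
    inputs = {
        "top_p": d.top_p,
        "stop_repetition": d.stop_repetition,
        "sample_batch_size": d.sample_batch_size,
    }
    unknown = set(overrides) - set(inputs)
    if unknown:
        raise ValueError(f"unknown sampling parameters: {sorted(unknown)}")
    inputs.update(overrides)
    return inputs


editing_inputs = sampling_inputs("editing")
tts_inputs = sampling_inputs("tts", top_p=0.8)
print(editing_inputs)
print(tts_inputs)
```

The resulting dictionary can be merged into the full input payload alongside `orig_audio`, `target_transcript`, and the other fields.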
Output Schema
Example Execution Logs
Using seed: 34156
Suppressing numeral and symbol tokens: [3, 4, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 352, 362, 405, 486, 513, 580, 604, 642, 657, 678, 718, 720, 767, 807, 830, 838, 860, 939, 940, 1065, 1105, 1120, 1129, 1157, 1160, 1238, 1248, 1264, 1270, 1314, 1315, 1367, 1415, 1433, 1467, 1478, 1485, 1495, 1507, 1511, 1542, 1558, 1584, 1594, 1596, 1679, 1731, 1795, 1802, 1821, 1828, 1853, 1899, 1946, 1954, 1959, 1983, 1987, 2026, 2075, 2078, 2079, 2091, 2154, 2167, 2177, 2211, 2231, 2242, 2310, 2319, 2321, 2327, 2388, 2414, 2425, 2481, 2534, 2548, 2579, 2598, 2608, 2623, 2624, 2670, 2681, 2682, 2713, 2718, 2757, 2780, 2791, 2808, 2813, 2816, 2857, 2864, 2919, 2920, 2931, 2996, 2998, 2999, 3023, 3050, 3064, 3070, 3104, 3126, 3132, 3134, 3261, 3270, 3312, 3324, 3365, 3388, 3439, 3459, 3510, 3553, 3559, 3571, 3648, 3682, 3695, 3717, 3720, 3829, 3865, 3901, 3933, 3980, 4019, 4051, 4059, 4064, 4089, 4101, 4153, 4248, 4304, 4309, 4310, 4317, 4343, 4349, 4353, 4407, 4521, 4524, 4531, 4570, 4626, 4747, 4751, 4761, 4764, 4790, 4793, 4846, 4869, 4967, 4974, 5014, 5066, 5075, 5125, 5214, 5237, 5304, 5323, 5332, 5333, 5433, 5441, 5472, 5534, 5539, 5598, 5607, 5705, 5774, 5816, 5824, 5846, 5867, 5878, 5892, 5946, 5996, 5999, 6052, 6073, 6135, 6200, 6244, 6298, 6303, 6337, 6390, 6420, 6469, 6640, 6659, 6740, 6885, 6957, 6999, 7029, 7169, 7175, 7192, 7198, 7225, 7265, 7337, 7358, 7388, 7410, 7441, 7600, 7618, 7632, 7643, 7724, 7769, 7795, 7816, 7863, 7908, 7930, 7982, 8054, 8069, 8093, 8190, 8235, 8257, 8269, 8275, 8298, 8309, 8454, 8487, 8541, 8576, 8628, 8644, 8646, 8684, 8699, 8702, 8735, 8753, 8784, 8854, 8870, 8915, 8949, 9031, 9130, 9162, 9166, 9193, 9225, 9415, 9507, 9508, 9656, 9661, 9698, 9768, 9773, 9796, 9804, 9849, 9879, 9907, 9919, 10048, 10053, 10083, 10111, 10163, 10190, 10232, 10249, 10261, 10333, 10460, 10495, 10531, 10535, 11024, 11104, 11245, 11323, 11442, 11445, 11470, 11509, 11528, 11546, 11623, 11645, 11785, 12113, 12122, 12131, 12279, 12713, 12726, 12762, 12825, 12844, 
12863, 12865, 12877, 12923, 12952, 13037, 13108, 13130, 13151, 13330, 13343, 13348, 13374, 13381, 13454, 13464, 13521, 13539, 13540, 13702, 13803, 14062, 14198, 14280, 14315, 14436, 14454, 14489, 14585, 14656, 14686, 14745, 14877, 14956, 14988, 15136, 15143, 15187, 15197, 15231, 15259, 15277, 15349, 15363, 15377, 15408, 15426, 15495, 15524, 15533, 15589, 15629, 15674, 15696, 15711, 15724, 15761, 15801, 15897, 15904, 15920, 15963, 15982, 16003, 16088, 16101, 16102, 16226, 16236, 16243, 16315, 16382, 16450, 16489, 16562, 16616, 16626, 16677, 16679, 16763, 16799, 16817, 16942, 16945, 16994, 17031, 17032, 17059, 17279, 17318, 17342, 17430, 17464, 17477, 17501, 17544, 17572, 17575, 17643, 17657, 17672, 17729, 17759, 17817, 17827, 17885, 17971, 18005, 18112, 18182, 18294, 18298, 18376, 18395, 18444, 18458, 18500, 18523, 18638, 18693, 18741, 18742, 18781, 18823, 18897, 18938, 18946, 19004, 19035, 19038, 19048, 19060, 19104, 19214, 19244, 19322, 19342, 19409, 19420, 19442, 19504, 19683, 19707, 19708, 19710, 19755, 19782, 19880, 19884, 19891, 19924, 20007, 20033, 20064, 20107, 20167, 20198, 20219, 20224, 20233, 20248, 20299, 20343, 20356, 20370, 20416, 20479, 20483, 20510, 20548, 20666, 20708, 20809, 20943, 20959, 20964, 20986, 21033, 21056, 21113, 21139, 21148, 21261, 21268, 21273, 21288, 21315, 21355, 21395, 21409, 21431, 21489, 21495, 21498, 21503, 21526, 21536, 21577, 21599, 21601, 21626, 21643, 21652, 21709, 21719, 21734, 21738, 21761, 21777, 21794, 21844, 21895, 21908, 21940, 22042, 22047, 22131, 22136, 22148, 22169, 22172, 22186, 22219, 22243, 22288, 22291, 22318, 22337, 22352, 22370, 22413, 22416, 22458, 22515, 22538, 22544, 22567, 22572, 22579, 22613, 22626, 22666, 22709, 22717, 22730, 22745, 22799, 22800, 22842, 22855, 22883, 22909, 22913, 22914, 22951, 22980, 22985, 22986, 22995, 22996, 23045, 23055, 23068, 23120, 23134, 23148, 23188, 23195, 23234, 23237, 23313, 23336, 23344, 23349, 23362, 23378, 23451, 23460, 23487, 23516, 23539, 23601, 23628, 23664, 23666, 
23679, 23721, 23726, 23734, 23753, 23756, 23815, 23847, 23859, 23871, 23906, 23924, 24038, 24041, 24045, 24063, 24096, 24136, 24137, 24168, 24214, 24217, 24235, 24294, 24309, 24339, 24356, 24369, 24403, 24409, 24414, 24465, 24529, 24555, 24591, 24598, 24648, 24652, 24669, 24693, 24718, 24760, 24793, 24839, 24840, 24848, 24894, 24909, 24938, 24940, 24943, 24970, 24977, 24991, 25022, 25054, 25061, 25090, 25096, 25150, 25177, 25181, 25190, 25191, 25240, 25257, 25264, 25270, 25272, 25307, 25325, 25326, 25399, 25429, 25475, 25500, 25508, 25540, 25597, 25600, 25643, 25644, 25645, 25667, 25674, 25707, 25710, 25764, 25816, 25829, 25836, 25838, 25859, 25870, 25948, 25964, 26007, 26050, 26063, 26073, 26115, 26118, 26143, 26200, 26250, 26259, 26276, 26279, 26352, 26422, 26427, 26429, 26481, 26492, 26514, 26525, 26539, 26561, 26582, 26598, 26607, 26660, 26704, 26709, 26717, 26753, 26780, 26826, 26833, 26881, 26895, 26912, 26937, 26956, 27019, 27033, 27037, 27057, 27121, 27137, 27191, 27192, 27203, 27211, 27228, 27230, 27253, 27260, 27277, 27301, 27310, 27326, 27367, 27368, 27371, 27408, 27412, 27550, 27551, 27559, 27621, 27641, 27649, 27653, 27693, 27696, 27712, 27720, 27728, 27778, 27790, 27791, 27795, 27800, 27824, 27829, 27877, 27936, 27937, 27956, 27970, 27988, 28011, 28017, 28041, 28054, 28072, 28119, 28174, 28256, 28262, 28277, 28296, 28324, 28362, 28369, 28460, 28481, 28551, 28555, 28560, 28567, 28581, 28592, 28598, 28645, 28658, 28669, 28676, 28684, 28687, 28688, 28694, 28714, 28727, 28771, 28815, 28817, 28857, 28872, 28878, 28896, 28933, 28947, 28977, 28978, 29022, 29041, 29059, 29088, 29101, 29110, 29119, 29143, 29159, 29173, 29211, 29217, 29228, 29279, 29300, 29326, 29331, 29334, 29414, 29416, 29524, 29558, 29568, 29626, 29637, 29691, 29703, 29769, 29796, 29807, 29903, 29953, 30005, 30057, 30110, 30120, 30123, 30179, 30206, 30272, 30273, 30290, 30299, 30336, 30368, 30435, 30453, 30460, 30483, 30484, 30505, 30557, 30607, 30610, 30695, 30704, 30727, 30743, 30763, 
30803, 30863, 30924, 30986, 30989, 30992, 30995, 31009, 31010, 31011, 31020, 31027, 31046, 31064, 31102, 31115, 31128, 31211, 31360, 31380, 31418, 31495, 31496, 31503, 31510, 31552, 31566, 31575, 31654, 31672, 31675, 31697, 31714, 31751, 31773, 31794, 31883, 31911, 31916, 31938, 31952, 31953, 31980, 31982, 32047, 32056, 32059, 32062, 32066, 32090, 32114, 32118, 32128, 32148, 32158, 32182, 32190, 32196, 32215, 32216, 32220, 32320, 32321, 32382, 32417, 32459, 32471, 32531, 32544, 32568, 32576, 32583, 32591, 32614, 32624, 32637, 32642, 32647, 32747, 32759, 32811, 32817, 32869, 32883, 32921, 32996, 33015, 33028, 33032, 33042, 33057, 33121, 33160, 33206, 33289, 33300, 33319, 33372, 33394, 33400, 33438, 33448, 33459, 33470, 33507, 33535, 33548, 33551, 33580, 33581, 33618, 33638, 33646, 33660, 33690, 33698, 33759, 33781, 33797, 33808, 33879, 33882, 33916, 33942, 33963, 33981, 34044, 34085, 34091, 34107, 34125, 34131, 34135, 34137, 34155, 34159, 34206, 34215, 34229, 34251, 34256, 34287, 34294, 34323, 34353, 34385, 34427, 34463, 34465, 34483, 34489, 34583, 34598, 34620, 34625, 34626, 34716, 34741, 34770, 34772, 34801, 34808, 34825, 34865, 34938, 34951, 35005, 35038, 35090, 35124, 35126, 35133, 35145, 35148, 35150, 35175, 35195, 35218, 35264, 35273, 35307, 35360, 35369, 35378, 35402, 35404, 35411, 35419, 35435, 35447, 35500, 35534, 35549, 35592, 35617, 35638, 35642, 35665, 35667, 35745, 35768, 35809, 35844, 35890, 35897, 35916, 35978, 35989, 36006, 36042, 36058, 36088, 36094, 36100, 36117, 36141, 36150, 36189, 36203, 36243, 36244, 36260, 36330, 36445, 36453, 36490, 36521, 36561, 36565, 36566, 36625, 36626, 36629, 36641, 36657, 36676, 36678, 36680, 36720, 36737, 36809, 36864, 36879, 36917, 36928, 36959, 36966, 36993, 37128, 37144, 37166, 37187, 37224, 37255, 37272, 37283, 37290, 37309, 37364, 37381, 37397, 37452, 37466, 37517, 37528, 37547, 37563, 37576, 37601, 37633, 37637, 37667, 37674, 37680, 37688, 37710, 37730, 37737, 37747, 37750, 37781, 37804, 37831, 37841, 37856, 
37864, 37950, 37967, 37988, 38055, 38056, 38073, 38089, 38107, 38108, 38123, 38147, 38158, 38172, 38190, 38205, 38210, 38219, 38249, 38314, 38326, 38339, 38369, 38380, 38384, 38391, 38431, 38446, 38449, 38472, 38503, 38525, 38547, 38549, 38565, 38569, 38595, 38605, 38612, 38634, 38652, 38703, 38721, 38783, 38819, 38831, 38850, 38867, 38892, 38902, 38905, 38907, 38956, 39064, 39084, 39088, 39093, 39101, 39103, 39111, 39118, 39121, 39132, 39135, 39166, 39174, 39188, 39195, 39226, 39251, 39254, 39260, 39277, 39280, 39320, 39322, 39357, 39380, 39449, 39466, 39506, 39509, 39570, 39595, 39647, 39658, 39667, 39697, 39710, 39761, 39768, 39850, 39861, 39882, 39885, 39923, 39925, 39937, 39997, 40022, 40035, 40064, 40090, 40111, 40149, 40173, 40179, 40215, 40220, 40248, 40256, 40271, 40286, 40350, 40353, 40384, 40385, 40393, 40400, 40401, 40403, 40417, 40427, 40454, 40463, 40486, 40523, 40538, 40554, 40585, 40639, 40643, 40652, 40654, 40660, 40675, 40736, 40761, 40828, 40839, 40873, 40884, 41019, 41023, 41060, 41103, 41172, 41208, 41234, 41235, 41241, 41247, 41263, 41287, 41289, 41290, 41292, 41322, 41417, 41423, 41435, 41507, 41531, 41544, 41561, 41569, 41580, 41583, 41612, 41625, 41647, 41655, 41706, 41717, 41734, 41739, 41761, 41810, 41813, 41820, 41853, 41874, 41879, 41922, 41931, 41934, 41948, 41977, 42018, 42032, 42060, 42117, 42141, 42163, 42199, 42215, 42224, 42240, 42246, 42250, 42294, 42313, 42321, 42334, 42363, 42444, 42479, 42489, 42520, 42534, 42548, 42622, 42671, 42691, 42716, 42751, 42752, 42759, 42780, 42802, 42819, 42830, 42875, 42877, 42947, 42980, 43019, 43134, 43147, 43155, 43184, 43193, 43234, 43239, 43240, 43284, 43292, 43313, 43336, 43356, 43364, 43367, 43379, 43434, 43452, 43489, 43509, 43526, 43550, 43564, 43571, 43587, 43610, 43637, 43641, 43665, 43686, 43690, 43697, 43704, 43722, 43785, 43798, 43864, 43916, 43918, 43927, 43950, 43977, 44063, 44084, 44085, 44087, 44093, 44103, 44169, 44183, 44214, 44215, 44218, 44227, 44230, 44300, 44318, 44341, 
44361, 44367, 44417, 44427, 44465, 44468, 44505, 44541, 44550, 44552, 44578, 44586, 44613, 44617, 44622, 44626, 44673, 44675, 44688, 44698, 44717, 44729, 44750, 44808, 44821, 44826, 44856, 44928, 44966, 44969, 44980, 44994, 45021, 45039, 45063, 45068, 45095, 45151, 45191, 45192, 45210, 45214, 45271, 45278, 45310, 45326, 45331, 45345, 45385, 45403, 45418, 45432, 45438, 45439, 45440, 45455, 45469, 45473, 45491, 45598, 45600, 45601, 45611, 45620, 45719, 45720, 45722, 45734, 45758, 45791, 45839, 45881, 45900, 45937, 45959, 45969, 45987, 46044, 46096, 46239, 46244, 46250, 46302, 46351, 46352, 46393, 46396, 46425, 46435, 46438, 46477, 46519, 46556, 46572, 46588, 46589, 46618, 46633, 46636, 46660, 46712, 46720, 46723, 46752, 46761, 46815, 46821, 46839, 46841, 46871, 46872, 46899, 46900, 46951, 46957, 47007, 47072, 47101, 47106, 47113, 47159, 47197, 47202, 47233, 47235, 47325, 47338, 47343, 47372, 47396, 47407, 47448, 47465, 47493, 47512, 47521, 47567, 47576, 47580, 47582, 47679, 47705, 47744, 47760, 47784, 47785, 47801, 47838, 47915, 47936, 47941, 47946, 47996, 48000, 48057, 48065, 48082, 48096, 48104, 48132, 48136, 48156, 48170, 48173, 48194, 48200, 48207, 48246, 48250, 48252, 48284, 48290, 48340, 48341, 48365, 48372, 48391, 48475, 48524, 48527, 48528, 48529, 48531, 48548, 48555, 48564, 48581, 48597, 48602, 48609, 48630, 48634, 48638, 48645, 48655, 48712, 48724, 48758, 48768, 48777, 48868, 48882, 48889, 48891, 48894, 48908, 48952, 48964, 49020, 49051, 49087, 49125, 49150, 49211, 49231, 49234, 49259, 49287, 49294, 49327, 49351, 49352, 49356, 49388, 49429, 49447, 49489, 49503, 49517, 49539, 49541, 49542, 49545, 49557, 49561, 49563, 49584, 49616, 49633, 49641, 49649, 49658, 49669, 49682, 49689, 49703, 49721, 49803, 49814, 49841, 49856, 49888, 49934, 49959, 49989, 49995, 50038, 50049, 50055, 50080, 50119, 50138, 50148, 50150, 50154, 50165, 50205, 50242]
The transcript from the Whisper model: But when I had approached so near to them, the common object, which the sense deceives, lost not by distance any of its marks.
WARNING:phonemizer:words count mismatch on 200.0% of the lines (2/1)
Version Details
Version ID
db97f6312d4c4d20e500e47fd95d8f14b00d8d28e046834faffb7999d83b6b30
Version Created
March 14, 2025