zsxkib/hololive-style-bert-vits2
About
🎙️Hololive text-to-speech and voice-to-voice (Japanese🇯🇵 + English🇬🇧)

Example Output
Output
Performance Metrics
- Prediction Time: 4.81s
- Total Time: 275.19s
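The gap between the two figures can be worked out directly. The short sketch below uses only the two numbers reported above; what fills the gap (queueing, a cold start, model setup) is an assumption, since the page does not say.

```python
# Figures copied from the Performance Metrics above.
prediction_time_s = 4.81
total_time_s = 275.19

# Time not spent in the prediction itself (cause not stated on the page).
overhead_s = round(total_time_s - prediction_time_s, 2)

# Fraction of the total wall-clock time spent predicting.
prediction_share = prediction_time_s / total_time_s

print(f"overhead: {overhead_s:.2f}s, prediction share: {prediction_share:.1%}")
# prints "overhead: 270.38s, prediction share: 1.7%"
```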
All Input Parameters
{
  "style": "Neutral",
  "speaker": "EN_MoriCalliope",
  "use_tone": false,
  "sdp_ratio": 0.2,
  "line_split": true,
  "style_text": "",
  "text_input": "Hello there! This is test audio of a new Hololive text to speech tool running on Replicate!",
  "noise_scale": 0.6,
  "length_scale": 1,
  "style_weight": 5,
  "noise_scale_w": 0.8,
  "split_interval": 0.5,
  "use_style_text": false,
  "style_text_weight": 0.7
}
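The same parameters can be submitted from code. The following is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The version hash is the one listed under Version Details; the helper name `run_example` is hypothetical. The type of the returned value (a URL or a file-like object) depends on the client version, so the sketch returns it untouched.

```python
# Model reference in "owner/name:version" form; the hash comes from Version Details.
MODEL_REF = (
    "zsxkib/hololive-style-bert-vits2:"
    "595ac4205eb84ba9330f178f2f2e4460f9ad9b67bcc8e744a7d8339f01ff24d4"
)

# Input values copied from the example prediction above.
EXAMPLE_INPUT = {
    "style": "Neutral",
    "speaker": "EN_MoriCalliope",
    "use_tone": False,
    "sdp_ratio": 0.2,
    "line_split": True,
    "style_text": "",
    "text_input": (
        "Hello there! This is test audio of a new Hololive "
        "text to speech tool running on Replicate!"
    ),
    "noise_scale": 0.6,
    "length_scale": 1,
    "style_weight": 5,
    "noise_scale_w": 0.8,
    "split_interval": 0.5,
    "use_style_text": False,
    "style_text_weight": 0.7,
}


def run_example(payload=None):
    """Submit a prediction and return whatever the client hands back.

    Hypothetical helper: needs network access, the third-party `replicate`
    package, and a valid REPLICATE_API_TOKEN.
    """
    import replicate  # imported lazily so the constants above work without it

    chosen = EXAMPLE_INPUT if payload is None else payload
    return replicate.run(MODEL_REF, input=chosen)
```

Calling `run_example()` would reproduce the example prediction; passing a modified copy of `EXAMPLE_INPUT` changes the speaker, style, or text.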
Input Parameters
- style: Style of speech to use (available choices may depend on the selected speaker)
- speaker: Speaker voice to use (for example, EN_MoriCalliope)
- use_tone: Whether to use tone information in the synthesis (Japanese only)
- sdp_ratio: Mixing ratio between the stochastic duration predictor (SDP) and the deterministic duration predictor
- line_split: Whether to split the text into lines and synthesize each line separately
- style_text: Additional text to guide the style of the synthesis
- text_input: Text to convert to speech (text-to-voice)
- noise_scale: Scale of the noise added during synthesis
- length_scale: Scale factor for the length of the synthesized speech (larger values give longer, slower speech)
- style_weight: Weight of the style effect
- noise_scale_w: Scale of the noise used by the stochastic duration predictor
- split_interval: Length of silence, in seconds, inserted between lines when line_split is true
- use_style_text: Whether to use style_text in the synthesis
- style_text_weight: Weight of the style text effect
- reference_audio_path: Path to a reference audio file (voice-to-voice)
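Before sending a request, it can help to catch misspelled names or wrong value types locally. The sketch below is one way to do that. The helper `check_payload` and the table `EXPECTED_TYPES` are hypothetical, and the types are inferred from the example payload on this page rather than taken from a published schema. `reference_audio_path` is left unchecked apart from rejecting booleans, on the assumption that it may be a URL string or an open file handle depending on the client.

```python
# Expected Python types per parameter, inferred from the example payload (an assumption).
NUMBER = (int, float)
EXPECTED_TYPES = {
    "style": str,
    "speaker": str,
    "use_tone": bool,
    "sdp_ratio": NUMBER,
    "line_split": bool,
    "style_text": str,
    "text_input": str,
    "noise_scale": NUMBER,
    "length_scale": NUMBER,
    "style_weight": NUMBER,
    "noise_scale_w": NUMBER,
    "split_interval": NUMBER,
    "use_style_text": bool,
    "style_text_weight": NUMBER,
    "reference_audio_path": object,  # URL string or file handle; not type-checked
}


def check_payload(payload):
    """Return a list of problems found in payload; an empty list means none."""
    problems = []
    for name, value in payload.items():
        if name not in EXPECTED_TYPES:
            problems.append(f"unknown parameter: {name}")
            continue
        expected = EXPECTED_TYPES[name]
        # bool is a subclass of int in Python, so reject booleans
        # explicitly wherever a boolean is not the expected type.
        if expected is not bool and isinstance(value, bool):
            problems.append(f"{name}: unexpected boolean")
        elif not isinstance(value, expected):
            problems.append(f"{name}: unexpected type {type(value).__name__}")
    return problems
```

For example, `check_payload({"sdp_ratio": "0.2"})` reports a type problem because the value is a string, while `check_payload({"sdp_ratio": 0.2})` returns an empty list.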
Output Schema
Output
Example Execution Logs
[!] model_name: SBV2_HoloLow
[!] model_path: model_assets/SBV2_HoloLow/SBV2_HoloLow.safetensors
[!] text: Hello there! This is test audio of a new Hololive text to speech tool running on Replicate!
[!] language: EN
[!] reference_audio_path: None
[!] sdp_ratio: 0.2
[!] noise_scale: 0.6
[!] noise_scale_w: 0.8
[!] length_scale: 1.0
[!] line_split: True
[!] split_interval: 0.5
[!] assist_text:
[!] assist_text_weight: 0.7
[!] use_assist_text: False
[!] style: Neutral
[!] style_weight: 5.0
[!] kata_tone_json_str:
[!] use_tone: False
[!] speaker: MoriCalliope
[!] Swapped to model 'SBV2_HoloLow'
/root/.pyenv/versions/3.11.9/lib/python3.11/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
[!] Successful inference, took 3.519026s | MoriCalliope | EN/0.2/0.6/0.8/1.0/Neutral/5.0 | Hello there! This is test audio of a new Hololive text to speech tool running on Replicate!
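The final log entry packs the run's settings into a single pipe-separated summary, which is convenient to parse when collecting timings. The sketch below is a hypothetical parser, not part of the model's code. It assumes the slash-separated group is ordered language, sdp_ratio, noise_scale, noise_scale_w, length_scale, style, style_weight, an order inferred by matching the values against the individual log entries above.

```python
import re

# Matches: "Successful inference, took <seconds>s | <speaker> | <settings> | <text>"
SUMMARY_RE = re.compile(
    r"Successful inference, took (?P<seconds>[\d.]+)s \| "
    r"(?P<speaker>[^|]+) \| (?P<settings>[^|]+) \| (?P<text>.*)"
)

# Assumed field order of the slash-separated settings group.
SETTING_NAMES = (
    "language", "sdp_ratio", "noise_scale", "noise_scale_w",
    "length_scale", "style", "style_weight",
)
NUMERIC_SETTINGS = {
    "sdp_ratio", "noise_scale", "noise_scale_w", "length_scale", "style_weight",
}

SAMPLE = (
    "[!] Successful inference, took 3.519026s | MoriCalliope | "
    "EN/0.2/0.6/0.8/1.0/Neutral/5.0 | Hello there! This is test audio of a "
    "new Hololive text to speech tool running on Replicate!"
)


def parse_summary(line):
    """Return a dict of fields from a summary line, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    values = match.group("settings").strip().split("/")
    if len(values) != len(SETTING_NAMES):
        return None  # unexpected layout, e.g. a style name containing "/"
    result = {
        "seconds": float(match.group("seconds")),
        "speaker": match.group("speaker").strip(),
        "text": match.group("text").strip(),
    }
    for name, value in zip(SETTING_NAMES, values):
        result[name] = float(value) if name in NUMERIC_SETTINGS else value
    return result
```

Calling `parse_summary(SAMPLE)` yields the inference time as a float alongside the speaker and settings; lines that do not match return `None` rather than raising.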
Version Details
- Version ID: 595ac4205eb84ba9330f178f2f2e4460f9ad9b67bcc8e744a7d8339f01ff24d4
- Version Created: June 3, 2024