zsxkib/realistic-voice-cloning 🔢❓🖼️📝 → 🖼️

▶️ 1.1M runs 📅 Nov 2023 ⚙️ Cog 0.8.6 🔗 GitHub ⚖️ License
audio-to-audio voice-cloning voice-conversion

About

Create song covers with any RVC v2 trained AI voice from audio files.

Example Output

Performance Metrics

2.81s Prediction Time
2.83s Total Time
All Input Parameters
{
  "protect": 0.33,
  "rvc_model": "Squidward",
  "index_rate": 0.5,
  "song_input": "https://replicate.delivery/pbxt/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
  "reverb_size": 0.15,
  "pitch_change": "no-change",
  "rms_mix_rate": 0.25,
  "filter_radius": 3,
  "output_format": "mp3",
  "reverb_damping": 0.7,
  "reverb_dryness": 0.8,
  "reverb_wetness": 0.2,
  "crepe_hop_length": 128,
  "pitch_change_all": 0,
  "main_vocals_volume_change": 10,
  "pitch_detection_algorithm": "rmvpe",
  "instrumental_volume_change": 0,
  "backup_vocals_volume_change": 0
}
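The example input above could also be submitted from a script. What follows is a minimal sketch, assuming the third-party `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_cover` are illustrative and do not come from the model page. The model reference combines the model name with the version ID listed under Version Details.

```python
import os

# Model reference: owner/name plus the version ID shown under "Version Details".
MODEL_REF = (
    "zsxkib/realistic-voice-cloning:"
    "0a9c7c558af4c0f20667c1bd1260ce32a2879944a0b9e44e1398660c077b1550"
)

# Values copied from the "All Input Parameters" example.
EXAMPLE_INPUT = {
    "protect": 0.33,
    "rvc_model": "Squidward",
    "index_rate": 0.5,
    "song_input": "https://replicate.delivery/pbxt/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
    "reverb_size": 0.15,
    "pitch_change": "no-change",
    "rms_mix_rate": 0.25,
    "filter_radius": 3,
    "output_format": "mp3",
    "reverb_damping": 0.7,
    "reverb_dryness": 0.8,
    "reverb_wetness": 0.2,
    "crepe_hop_length": 128,
    "pitch_change_all": 0,
    "main_vocals_volume_change": 10,
    "pitch_detection_algorithm": "rmvpe",
    "instrumental_volume_change": 0,
    "backup_vocals_volume_change": 0,
}

# The only documented parameter that is absent from the example input.
OPTIONAL_KEYS = {"custom_rvc_model_download_url"}


def build_input(**overrides):
    """Return a copy of the example input with selected fields replaced."""
    unknown = set(overrides) - set(EXAMPLE_INPUT) - OPTIONAL_KEYS
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**EXAMPLE_INPUT, **overrides}


def run_cover(inputs):
    """Submit one prediction (hypothetical helper; performs a network call)."""
    import replicate  # imported lazily so the rest of the sketch runs without it

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("REPLICATE_API_TOKEN is not set")
    return replicate.run(MODEL_REF, input=inputs)
```

For instance, `run_cover(build_input(output_format="wav"))` would request the same cover as a WAV file, provided the token is valid.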
Input Parameters
protect Type: number, Default: 0.33, Range: 0 - 0.5
Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set 0.5 to disable.
rvc_model Default: Squidward
RVC model for a specific voice. If using a custom model, this should match the name of the downloaded model. If a 'custom_rvc_model_download_url' is provided, this will be automatically set to the name of the downloaded model.
index_rate Type: number, Default: 0.5, Range: 0 - 1
Control how much of the AI's accent to leave in the vocals.
song_input Type: string
Upload your audio file here.
reverb_size Type: number, Default: 0.15, Range: 0 - 1
The larger the room, the longer the reverb time.
pitch_change Default: no-change
Adjust pitch of AI vocals. Options: `no-change`, `male-to-female`, `female-to-male`.
rms_mix_rate Type: number, Default: 0.25, Range: 0 - 1
Control how much to use the original vocal's loudness (0) or a fixed loudness (1).
filter_radius Type: integer, Default: 3, Range: 0 - 7
If >=3: apply median filtering to the harvested pitch results.
output_format Default: mp3
wav for best quality and large file size, mp3 for decent quality and small file size.
reverb_damping Type: number, Default: 0.7, Range: 0 - 1
Absorption of high frequencies in the reverb.
reverb_dryness Type: number, Default: 0.8, Range: 0 - 1
Level of AI vocals without reverb.
reverb_wetness Type: number, Default: 0.2, Range: 0 - 1
Level of AI vocals with reverb.
crepe_hop_length Type: integer, Default: 128
When `pitch_detection_algorithm` is set to `mangio-crepe`, this controls how often, in milliseconds, it checks for pitch changes. Lower values lead to longer conversions and a higher risk of voice cracks, but better pitch accuracy.
pitch_change_all Type: number, Default: 0
Change pitch/key of background music, backup vocals and AI vocals in semitones. Reduces sound quality slightly.
main_vocals_volume_change Type: number, Default: 0
Control volume of main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase the volume by 3 decibels.
pitch_detection_algorithm Default: rmvpe
Best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals).
instrumental_volume_change Type: number, Default: 0
Control volume of the background music/instrumentals.
backup_vocals_volume_change Type: number, Default: 0
Control volume of backup AI vocals.
custom_rvc_model_download_url Type: string
URL to download a custom RVC model. If provided, the model will be downloaded (if it doesn't already exist) and used for prediction, regardless of the 'rvc_model' value.
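The documented ranges can be checked locally before a prediction is submitted. The sketch below is one possible approach and is not part of the model itself. The `RANGES` table is transcribed from the parameter list above. The `CHOICES` sets are inferred from the option descriptions and may be incomplete. The name `validate` is hypothetical.

```python
# Numeric ranges transcribed from the parameter list; parameters without a
# documented range (for example crepe_hop_length) are not checked.
RANGES = {
    "protect": (0, 0.5),
    "index_rate": (0, 1),
    "reverb_size": (0, 1),
    "rms_mix_rate": (0, 1),
    "filter_radius": (0, 7),
    "reverb_damping": (0, 1),
    "reverb_dryness": (0, 1),
    "reverb_wetness": (0, 1),
}

# Assumption: these option sets are inferred from the descriptions and
# may not list every value the model accepts.
CHOICES = {
    "pitch_change": {"no-change", "male-to-female", "female-to-male"},
    "output_format": {"mp3", "wav"},
    "pitch_detection_algorithm": {"rmvpe", "mangio-crepe"},
}


def validate(inputs):
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in inputs and not (low <= inputs[name] <= high):
            errors.append(f"{name}={inputs[name]!r} is outside {low} - {high}")
    for name, allowed in CHOICES.items():
        if name in inputs and inputs[name] not in allowed:
            errors.append(f"{name}={inputs[name]!r} is not one of {sorted(allowed)}")
    return errors
```

Parameters that are missing from `inputs` are skipped, on the assumption that the model falls back to its documented defaults.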
Output Schema

Output

Type: string, Format: uri
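Because the output is a URI, the generated cover has to be fetched separately. A minimal sketch using only the Python standard library follows. The helper name `save_output` is hypothetical. Calling `str()` on the output is an assumption intended to cover client versions that wrap the URI in an object rather than returning a plain string.

```python
import shutil
import urllib.request
from pathlib import Path


def save_output(output, dest):
    """Download the prediction output to `dest` and return the destination path."""
    uri = str(output)  # assumption: str() of the output yields the URI
    dest = Path(dest)
    # Stream the response to disk instead of loading the whole file into memory.
    with urllib.request.urlopen(uri) as response, dest.open("wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

A call such as `save_output(output, "cover.mp3")` would write the file into the current directory. The extension should match the `output_format` that was requested.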

Example Execution Logs
[~] Starting AI Cover Generation Pipeline...
[~] Applying audio effects to Vocals...
[~] Combining AI Vocals and Instrumentals...
[~] Removing intermediate audio files...
[+] Cover generated at /src/song_output/d0386572142/tmp11cudxj_gangnam (Squidward Ver).mp3
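If the logs need to be inspected programmatically, the line prefixes can be used to separate progress messages from the final result. The sketch below assumes the format seen in this single example, with `[~]` for progress steps and `[+] Cover generated at` for the output path. Other versions of the model may log differently. The function name `parse_logs` is hypothetical.

```python
import re

# Patterns based only on the example logs above.
STEP_RE = re.compile(r"^\[~\] (.+)$")
DONE_RE = re.compile(r"^\[\+\] Cover generated at (.+)$")


def parse_logs(text):
    """Return (steps, output_path); output_path is None if no result line is found."""
    steps = []
    output_path = None
    for line in text.splitlines():
        line = line.strip()
        step = STEP_RE.match(line)
        if step:
            steps.append(step.group(1))
            continue
        done = DONE_RE.match(line)
        if done:
            output_path = done.group(1)
    return steps, output_path
```

The returned path refers to a location inside the model's container, so it is useful for diagnostics only and not for fetching the file.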
Version Details
Version ID
0a9c7c558af4c0f20667c1bd1260ce32a2879944a0b9e44e1398660c077b1550
Version Created
November 15, 2023