zsxkib/realistic-voice-cloning 🔢❓🖼️📝 → 🖼️

▶️ 1.6M runs 📅 Nov 2023 ⚙️ Cog 0.8.6 🔗 GitHub ⚖️ License

audio-to-audio voice-cloning voice-conversion

About

Create song covers with any RVC v2 trained AI voice from audio files.

Example Output

Performance Metrics

2.81s Prediction Time

2.83s Total Time

All Input Parameters

{
  "protect": 0.33,
  "rvc_model": "Squidward",
  "index_rate": 0.5,
  "song_input": "https://replicate.delivery/pbxt/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
  "reverb_size": 0.15,
  "pitch_change": "no-change",
  "rms_mix_rate": 0.25,
  "filter_radius": 3,
  "output_format": "mp3",
  "reverb_damping": 0.7,
  "reverb_dryness": 0.8,
  "reverb_wetness": 0.2,
  "crepe_hop_length": 128,
  "pitch_change_all": 0,
  "main_vocals_volume_change": 10,
  "pitch_detection_algorithm": "rmvpe",
  "instrumental_volume_change": 0,
  "backup_vocals_volume_change": 0
}
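
The parameters above can be submitted through Replicate's Python client. The following is a minimal sketch, assuming the `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_cover` are illustrative and not part of the model; the version hash is the one listed under Version Details.

```python
MODEL = "zsxkib/realistic-voice-cloning"
VERSION = "0a9c7c558af4c0f20667c1bd1260ce32a2879944a0b9e44e1398660c077b1550"

# Inputs copied from the example prediction above.
EXAMPLE_INPUT = {
    "protect": 0.33,
    "rvc_model": "Squidward",
    "index_rate": 0.5,
    "song_input": "https://replicate.delivery/pbxt/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
    "reverb_size": 0.15,
    "pitch_change": "no-change",
    "rms_mix_rate": 0.25,
    "filter_radius": 3,
    "output_format": "mp3",
    "reverb_damping": 0.7,
    "reverb_dryness": 0.8,
    "reverb_wetness": 0.2,
    "crepe_hop_length": 128,
    "pitch_change_all": 0,
    "main_vocals_volume_change": 10,
    "pitch_detection_algorithm": "rmvpe",
    "instrumental_volume_change": 0,
    "backup_vocals_volume_change": 0,
}


def build_input(song_input, rvc_model="Squidward", **overrides):
    """Return a copy of the example inputs with the song, voice, and overrides applied."""
    params = dict(EXAMPLE_INPUT)  # copy so the template is never mutated
    params.update(overrides)
    params["song_input"] = song_input
    params["rvc_model"] = rvc_model
    return params


def run_cover(params):
    """Send one prediction to Replicate (network call, needs an API token)."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(f"{MODEL}:{VERSION}", input=params)
```

Calling `run_cover(build_input("https://example.com/song.mp3"))` would start a prediction. Depending on the client version, the return value may be a URL string or a file-like object wrapping that URL.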

Input Parameters

protect (Type: number, Default: 0.33, Range: 0 - 0.5): Controls how much of the original vocals' breath and voiceless consonants to keep in the AI vocals. Set to 0.5 to disable.
rvc_model (Default: Squidward): RVC model for a specific voice. If using a custom model, this should match the name of the downloaded model. If a 'custom_rvc_model_download_url' is provided, this is automatically set to the name of the downloaded model.
index_rate (Type: number, Default: 0.5, Range: 0 - 1): Controls how much of the AI's accent to keep in the vocals.
song_input (Type: string): Upload your audio file here.
reverb_size (Type: number, Default: 0.15, Range: 0 - 1): The larger the room, the longer the reverb time.
pitch_change (Default: no-change): Adjusts the pitch of the AI vocals. Options: `no-change`, `male-to-female`, `female-to-male`.
rms_mix_rate (Type: number, Default: 0.25, Range: 0 - 1): Controls how much to use the original vocal's loudness (0) or a fixed loudness (1).
filter_radius (Type: integer, Default: 3, Range: 0 - 7): If >=3, applies median filtering to the harvested pitch results.
output_format (Default: mp3): wav for best quality and large file size, mp3 for decent quality and small file size.
reverb_damping (Type: number, Default: 0.7, Range: 0 - 1): Absorption of high frequencies in the reverb.
reverb_dryness (Type: number, Default: 0.8, Range: 0 - 1): Level of AI vocals without reverb.
reverb_wetness (Type: number, Default: 0.2, Range: 0 - 1): Level of AI vocals with reverb.
crepe_hop_length (Type: integer, Default: 128): When `pitch_detection_algorithm` is set to `mangio-crepe`, this controls how often it checks for pitch changes, in milliseconds. Lower values lead to longer conversions and a higher risk of voice cracks, but better pitch accuracy.
pitch_change_all (Type: number, Default: 0): Changes the pitch/key of the background music, backup vocals, and AI vocals, in semitones. Reduces sound quality slightly.
main_vocals_volume_change (Type: number, Default: 0): Controls the volume of the main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase it by 3 decibels.
pitch_detection_algorithm (Default: rmvpe): The best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals).
instrumental_volume_change (Type: number, Default: 0): Controls the volume of the background music/instrumentals.
backup_vocals_volume_change (Type: number, Default: 0): Controls the volume of the backup AI vocals.
custom_rvc_model_download_url (Type: string): URL to download a custom RVC model. If provided, the model is downloaded (if it doesn't already exist) and used for prediction, regardless of the 'rvc_model' value.
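
The ranges and options listed above can be checked on the client before a prediction is submitted. This is a sketch only: `validate` and `db_to_gain` are hypothetical helpers, the bounds are copied from the list above, and the decibel formula is the standard amplitude conversion rather than anything taken from the model's own code.

```python
# (low, high) bounds copied from the documented parameter ranges.
RANGES = {
    "protect": (0.0, 0.5),
    "index_rate": (0.0, 1.0),
    "reverb_size": (0.0, 1.0),
    "rms_mix_rate": (0.0, 1.0),
    "filter_radius": (0, 7),
    "reverb_damping": (0.0, 1.0),
    "reverb_dryness": (0.0, 1.0),
    "reverb_wetness": (0.0, 1.0),
}

# Only parameters whose options are explicitly enumerated in the list.
CHOICES = {
    "pitch_change": {"no-change", "male-to-female", "female-to-male"},
}


def validate(params):
    """Return a list of problems found in params; an empty list means none."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} is outside {low} - {high}")
    for name, allowed in CHOICES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}={params[name]!r} is not one of {sorted(allowed)}")
    return problems


def db_to_gain(db):
    """Convert a decibel change into a linear amplitude factor."""
    return 10 ** (db / 20)
```

Under this conversion, a `main_vocals_volume_change` of 3 corresponds to an amplitude factor of roughly 1.41, and -3 to roughly 0.71.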

Output Schema

Output

Type: string • Format: uri
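
Because the output is a URI string, the generated cover has to be fetched separately. A minimal sketch using only the Python standard library; `filename_from_uri` and `download_output` are illustrative names, and the fallback file name is an assumption.

```python
import os
import shutil
import urllib.request
from urllib.parse import unquote, urlparse


def filename_from_uri(uri, fallback="cover.mp3"):
    """Derive a local file name from the last path segment of the URI."""
    name = os.path.basename(unquote(urlparse(uri).path))
    return name or fallback  # URIs ending in "/" have no usable segment


def download_output(uri, dest_dir="."):
    """Stream the file behind the output URI to disk and return its path."""
    dest = os.path.join(dest_dir, filename_from_uri(uri))
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```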

Example Execution Logs

[~] Starting AI Cover Generation Pipeline...
[~] Applying audio effects to Vocals...
[~] Combining AI Vocals and Instrumentals...
[~] Removing intermediate audio files...
[+] Cover generated at /src/song_output/d0386572142/tmp11cudxj_gangnam (Squidward Ver).mp3

Version Details

Version ID: 0a9c7c558af4c0f20667c1bd1260ce32a2879944a0b9e44e1398660c077b1550
Version Created: November 15, 2023
