codehappynice/voicegenerator

▶️ 41 runs 📅 Nov 2024 ⚙️ Cog 0.9.26

multilingual-tts text-to-speech voice-cloning

Performance

3.7s Typical run time

~243s Cold start (first call)

41 Total runs

About

This model generates speech from text in the voice of a reference speaker (voice cloning), with a selectable output language.

Example Output

Output

Performance Metrics

3.67s Prediction Time

242.79s Total Time

All Input Parameters

{
  "text": "Hi there, I'm your new voice clone. Try your best to upload quality audio",
  "speaker": "https://audioaiforyou.s3.us-east-2.amazonaws.com/voicemodel/female.wav",
  "language": "en",
  "cleanup_voice": false
}

Input Parameters

text Type: string. Default: "Hi there, I'm your new voice clone. Try your best to upload quality audio". Text to synthesize.
speaker (required) Type: string. URL of the original speaker audio (wav, mp3, m4a, ogg, or flv). Duration should be at least 6 seconds.
language Default: en. Output language for the synthesized speech.
cleanup_voice Type: boolean. Default: false. Whether to apply denoising to the speaker audio (microphone recordings).
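
The parameters above could be assembled and sent with the Replicate Python client roughly as follows. This is a minimal sketch, not the author's own code: `build_input` and `synthesize` are hypothetical helper names, the client is assumed to be installed (`pip install replicate`) with `REPLICATE_API_TOKEN` set in the environment, and the version ID is the one listed under Version Details. The output format is not documented on this page, so the sketch does not assume one.

```python
import json

# Model name and version ID as listed on this page.
MODEL = "codehappynice/voicegenerator"
VERSION = "2f090b152150d1b55b7d5234ec80f2e5661a8cbb6b64112013c5fe47a36c1fa5"

DEFAULT_TEXT = (
    "Hi there, I'm your new voice clone. "
    "Try your best to upload quality audio"
)


def build_input(speaker, text=DEFAULT_TEXT, language="en", cleanup_voice=False):
    """Assemble the input payload (hypothetical helper, not part of any API)."""
    if not speaker:
        # `speaker` is the only parameter marked as required.
        raise ValueError("speaker is required: a URL to the reference audio")
    return {
        "text": text,
        "speaker": speaker,
        "language": language,
        "cleanup_voice": bool(cleanup_voice),
    }


def synthesize(payload):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # imported lazily so the payload helper works without it

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)


if __name__ == "__main__":
    payload = build_input(
        "https://audioaiforyou.s3.us-east-2.amazonaws.com/voicemodel/female.wav"
    )
    print(json.dumps(payload, indent=2))
    # output = synthesize(payload)  # uncomment to actually run the model
```

Keeping payload construction separate from the network call makes it possible to check the inputs locally before paying for a run, which matters given the long cold start.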

Output Schema

Output

Example Execution Logs

ZIP File Saved: ./demo_file/female.wav
sh: 1: ffmpeg: not found
> Text splitted to sentences.
["Hi there, I'm your new voice clone.", 'Try your best to upload quality audio']
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
> Processing time: 3.1524736881256104
> Real-time factor: 0.5631971482302446
delete video success
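
The two timing lines in the logs can be combined to estimate the length of the generated clip. The sketch below assumes the real-time factor is defined in the usual way, as processing time divided by the duration of the synthesized audio; the logs themselves do not state the definition.

```python
# Values copied from the example execution logs.
processing_time_s = 3.1524736881256104
real_time_factor = 0.5631971482302446

# Assumption: RTF = processing time / duration of generated audio,
# so the audio duration is the processing time divided by the RTF.
audio_duration_s = processing_time_s / real_time_factor

print(f"Implied audio duration: {audio_duration_s:.2f} s")
```

Under that assumption the clip is roughly 5.6 seconds long, and a real-time factor below 1 means synthesis ran faster than playback.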

Version Details

Version ID: 2f090b152150d1b55b7d5234ec80f2e5661a8cbb6b64112013c5fe47a36c1fa5
Version Created: November 12, 2024
