ttsds/maskgct

▶️ 493 runs 📅 Jan 2025 ⚙️ Cog 0.13.6

text-to-speech voice-cloning

Performance

14.4s Typical run time

~340s Cold start (first call)

493 Total runs

About

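MaskGCT is a text-to-speech and voice-cloning model by Amphion. This deployment takes the text to speak plus a reference recording and returns a link to the generated audio.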

Example Output

Performance Metrics

14.40s Prediction Time

340.11s Total Time
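
The two figures above differ by a wide margin, and the gap is worth making explicit when choosing a client-side timeout. The sketch below only does the subtraction on the numbers reported here. Reading the difference as queueing plus cold-start setup is an assumption based on the "~340s Cold start (first call)" figure, not something the page states directly.

```python
# Figures copied from the Performance Metrics above.
prediction_s = 14.40   # time spent inside the model's predict step
total_s = 340.11       # wall-clock time for the whole example run

# Time not spent predicting. Assumed here to be queueing plus cold-start setup.
overhead_s = round(total_s - prediction_s, 2)
overhead_share = overhead_s / total_s

print(f"{overhead_s:.2f}s outside prediction ({overhead_share:.0%} of total)")
```

On a first call, a timeout sized for the 14.4s typical run time would likely expire long before the result arrives.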

All Input Parameters

{
  "text": "With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good.",
  "language": "en",
  "text_reference": "and keeping eternity before the eyes, though much",
  "speaker_reference": "https://replicate.delivery/pbxt/MNDu8UJR7zB1dZHG3UOPCD5B4crZunv2j32UsTd3Qd5PdG1R/example.wav"
}

Input Parameters

text (required) Type: string
language (required)
text_reference (required) Type: string
speaker_reference (required) Type: string
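
A minimal sketch of how these four inputs might be assembled and sent with the Replicate Python client. `build_input` and `run_maskgct` are hypothetical helper names introduced for illustration. The example values are the ones listed under All Input Parameters, and the version hash is the one given under Version Details. The sketch assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment.

```python
# Model reference in "owner/name:version" form, using the Version ID from this page.
MODEL_REF = (
    "ttsds/maskgct:"
    "7bd535bef57f4ea7e45e45c73fb2fda847b8ebd27df6c9550f5ba1a1742a66f5"
)

REQUIRED_FIELDS = ("text", "language", "text_reference", "speaker_reference")


def build_input(text, language, text_reference, speaker_reference):
    """Collect the four required inputs and reject empty values (hypothetical helper)."""
    payload = {
        "text": text,
        "language": language,
        "text_reference": text_reference,
        "speaker_reference": speaker_reference,
    }
    missing = [name for name in REQUIRED_FIELDS if not payload[name]]
    if missing:
        raise ValueError(f"missing required input(s): {', '.join(missing)}")
    return payload


def run_maskgct(payload):
    """Send the payload to Replicate. Needs network access and an API token."""
    import replicate  # third-party client, imported lazily so the rest runs offline

    return replicate.run(MODEL_REF, input=payload)


# Example values copied from "All Input Parameters" on this page.
example_payload = build_input(
    text="With tenure, Suzie'd have all the more leisure for yachting, "
         "but her publications are no good.",
    language="en",
    text_reference="and keeping eternity before the eyes, though much",
    speaker_reference="https://replicate.delivery/pbxt/"
                      "MNDu8UJR7zB1dZHG3UOPCD5B4crZunv2j32UsTd3Qd5PdG1R/example.wav",
)
```

Calling `run_maskgct(example_payload)` would then start a prediction. The first call may take several minutes because of the cold start reported under Performance.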

Output Schema

Output

Type: string • Format: uri
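
Since the output is a URI string rather than raw audio, the caller has to fetch the file separately. The sketch below shows one way to check and download it using only the standard library. `output_url` and `save_audio` are hypothetical helpers, and the `.wav` default filename is an assumption, since the page does not state the output audio format. `str()` is applied first because some client versions may return a file-like wrapper instead of a plain string.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def output_url(output):
    """Return the output as an http(s) URL string, or raise ValueError."""
    url = str(output)  # plain strings pass through unchanged
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) URI, got {url!r}")
    return url


def save_audio(output, path="maskgct_output.wav"):
    """Download the generated audio to `path` (filename is an assumption)."""
    with urlopen(output_url(output)) as response, open(path, "wb") as fh:
        fh.write(response.read())
    return path
```

`save_audio` needs network access. `output_url` can be used on its own to validate a result before storing or forwarding the link.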

Example Execution Logs

predict semantic shape torch.Size([1, 354])

Version Details

Version ID: 7bd535bef57f4ea7e45e45c73fb2fda847b8ebd27df6c9550f5ba1a1742a66f5
Version Created: January 29, 2025
