lucataco/step-audio-tts-3b
About
Step-Audio-TTS-3B is described as the industry's first Text-to-Speech (TTS) model trained on a large-scale synthetic dataset using the LLM-Chat paradigm.

Example Output
Output
Performance Metrics
- Prediction Time: 53.35s
- Total Time: 66.04s
All Input Parameters
{ "text": "(RAP) I set out on the journey of freedom, chasing that distant dream, breaking free from the shackles of bondage, letting my soul drift with the wind, every step is full of power, every moment is extremely shining, the belief in freedom is burning, illuminating the direction of my progress!", "speaker_name": "闫雨婷" }
Input Parameters
- text: Text to synthesize into speech
- speaker_name: Speaker name
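
As a rough illustration of how the two input parameters fit into an API call, the sketch below builds a request for Replicate's HTTP predictions endpoint using only the Python standard library. The endpoint URL, the `Bearer` authorization header, and the `REPLICATE_API_TOKEN` environment variable are assumptions that do not appear on this page. The helper names `build_prediction_request` and `create_prediction` are hypothetical. The version ID is the one listed under Version Details.

```python
import json
import urllib.request

# Version ID copied from the "Version Details" section of this page.
VERSION_ID = "8c30688893eb0a713273758033ea410a4022060142ac3ee5b93937ecbc27209f"
# Assumed endpoint for creating predictions over HTTP (not stated on this page).
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(text, speaker_name, token):
    """Build (but do not send) a POST request for one text-to-speech prediction."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {
        "version": VERSION_ID,
        # The two inputs documented under "Input Parameters".
        "input": {"text": text, "speaker_name": speaker_name},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed authentication scheme: a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_prediction(request):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, read a token with `os.environ["REPLICATE_API_TOKEN"]`, pass it to `build_prediction_request` together with a `text` and a `speaker_name` such as `"闫雨婷"`, then hand the result to `create_prediction`. The example run on this page took about a minute, so the response may describe a prediction that is still in progress rather than a finished audio file.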
Output Schema
Output
Version Details
- Version ID: 8c30688893eb0a713273758033ea410a4022060142ac3ee5b93937ecbc27209f
- Version Created: February 17, 2025