lucataco/ace-step

▶️ 110.4K runs 📅 May 2025 ⚙️ Cog 0.14.10 🔗 GitHub 📄 Paper ⚖️ License

music-generation singing-voice-generation text-to-music

About

ACE-Step: A Step Towards Music Generation Foundation Model (text2music)

Example Output

Output

Performance Metrics

6.18s Prediction Time

45.84s Total Time

All Input Parameters

{
  "seed": -1,
  "tags": "synth-pop, electronic, pop, synthesizer, drums, bass, piano, 128 BPM, energetic, uplifting, modern",
  "lyrics": "[verse]\nWoke up in a city that's always alive\nNeon lights they shimmer they thrive\nElectric pulses beat they drive\nMy heart races just to survive\n\n[chorus]\nOh electric dreams they keep me high\nThrough the wires I soar and fly\nMidnight rhythms in the sky\nElectric dreams together we’ll defy\n\n[verse]\nLost in the labyrinth of screens\nVirtual love or so it seems\nIn the night the city gleams\nDigital faces haunted by memes\n\n[chorus]\nOh electric dreams they keep me high\nThrough the wires I soar and fly\nMidnight rhythms in the sky\nElectric dreams together we’ll defy\n\n[bridge]\nSilent whispers in my ear\nPixelated love serene and clear\nThrough the chaos find you near\nIn electric dreams no fear\n\n[verse]\nBound by circuits intertwined\nLove like ours is hard to find\nIn this world we’re truly blind\nBut electric dreams free the mind",
  "duration": 60,
  "scheduler": "euler",
  "guidance_type": "apg",
  "guidance_scale": 15,
  "number_of_steps": 60,
  "granularity_scale": 10,
  "guidance_interval": 0.5,
  "min_guidance_scale": 3,
  "tag_guidance_scale": 0,
  "lyric_guidance_scale": 0,
  "guidance_interval_decay": 0
}
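The lyrics value above is one string in which each section opens with a bracketed label and sections are separated by a blank line. A minimal sketch of assembling such a string from structured data follows; the helper name build_lyrics and its input shape are hypothetical and not part of the model's API.

```python
def build_lyrics(sections):
    """Join (label, lines) pairs into a single tagged lyrics string.

    Hypothetical helper: each section becomes "[label]" followed by its
    lines, and sections are separated by one blank line, mirroring the
    layout of the example payload.
    """
    blocks = []
    for label, lines in sections:
        blocks.append("[" + label + "]\n" + "\n".join(lines))
    return "\n\n".join(blocks)


lyrics = build_lyrics([
    ("verse", [
        "Woke up in a city that's always alive",
        "Neon lights they shimmer they thrive",
    ]),
    ("chorus", [
        "Oh electric dreams they keep me high",
        "Through the wires I soar and fly",
    ]),
])
```

The resulting string can be passed as the lyrics input unchanged.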

Input Parameters

seed (Type: integer, Default: -1): Random seed. Set to -1 to randomize.
tags (required; Type: string): Text prompts to guide music generation, e.g., 'epic,cinematic'.
lyrics (Type: string): Lyrics for the music. Use [verse], [chorus], and [bridge] to separate different parts of the lyrics. Use [instrumental] or [inst] to generate instrumental music.
duration (Type: number, Default: 60, Range: 1 - 240): Duration of the generated audio in seconds. -1 means a random duration between 30 and 240 seconds.
scheduler (Default: euler): Scheduler type.
guidance_type (Default: apg): Guidance type for CFG.
guidance_scale (Type: number, Default: 15, Range: 0 - 30): Overall guidance scale.
number_of_steps (Type: integer, Default: 60, Range: 10 - 200): Number of inference steps.
granularity_scale (Type: number, Default: 10, Range: -100 - 100): Omega scale for APG guidance, or similar for other CFG types.
guidance_interval (Type: number, Default: 0.5, Range: 0 - 1): Guidance interval.
min_guidance_scale (Type: number, Default: 3, Range: 0 - 100): Minimum guidance scale.
tag_guidance_scale (Type: number, Default: 0, Range: 0 - 10): Guidance scale for tags (text prompt).
lyric_guidance_scale (Type: number, Default: 0, Range: 0 - 10): Guidance scale for lyrics.
guidance_interval_decay (Type: number, Default: 0, Range: 0 - 1): Guidance interval decay.
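The documented ranges can be checked on the client before a request is sent. The sketch below is one way to do that; validate_input and RANGES are hypothetical names, and the bounds are copied from the list above. It follows the listed ranges strictly, so it does not special-case the -1 duration mentioned in the duration description.

```python
# (minimum, maximum) bounds copied from the parameter list above.
RANGES = {
    "duration": (1, 240),
    "guidance_scale": (0, 30),
    "number_of_steps": (10, 200),
    "granularity_scale": (-100, 100),
    "guidance_interval": (0, 1),
    "min_guidance_scale": (0, 100),
    "tag_guidance_scale": (0, 10),
    "lyric_guidance_scale": (0, 10),
    "guidance_interval_decay": (0, 1),
}


def validate_input(params):
    """Return a list of problems found in an input dict (empty if none)."""
    errors = []
    # tags is the only input marked as required.
    if not params.get("tags"):
        errors.append("tags is required")
    # Only parameters that are present are range-checked.
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            errors.append(f"{name}={params[name]} is outside {low}..{high}")
    return errors
```

Calling validate_input on the example payload shown earlier returns an empty list, since every value there sits inside its documented range.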

Output Schema

Output

Type: string • Format: uri
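Because the output is a URI string, the audio has to be fetched separately. A minimal sketch using only the Python standard library follows; save_output is a hypothetical helper, and it assumes the URI can be read without extra authentication headers.

```python
import shutil
import urllib.request
from pathlib import Path


def save_output(uri, dest):
    """Stream the resource at `uri` into the file `dest` and return its path."""
    dest = Path(dest)
    # copyfileobj streams in chunks, so the audio is never held fully in memory.
    with urllib.request.urlopen(uri) as response, dest.open("wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

For example, save_output(output_uri, "electric_dreams.mp3") would write the generated track to the working directory, where output_uri stands for the string returned by a prediction.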

Example Execution Logs

Using seed: 6796391
Generating audio with duration: 60.0s, tags: 'synth-pop, electronic, pop, synthesizer, drums, bass, piano, 128 BPM, energetic, uplifting, modern', lyrics: 'True'
2025-05-14 15:17:23.278 | INFO     | pipeline_ace_step:text2music_diffusion_process:896 - cfg_type: apg, guidance_scale: 15.0, omega_scale: 10
2025-05-14 15:17:23.290 | INFO     | pipeline_ace_step:text2music_diffusion_process:1114 - start_idx: 15, end_idx: 45, num_inference_steps: 60
  0%|          | 0/60 [00:00<?, ?it/s]
  2%|▏         | 1/60 [00:00<00:23,  2.52it/s]
  7%|▋         | 4/60 [00:00<00:05,  9.39it/s]
 12%|█▏        | 7/60 [00:00<00:03, 13.84it/s]
 17%|█▋        | 10/60 [00:00<00:02, 16.81it/s]
 22%|██▏       | 13/60 [00:00<00:02, 18.81it/s]
 27%|██▋       | 16/60 [00:01<00:02, 16.58it/s]
 30%|███       | 18/60 [00:01<00:02, 15.50it/s]
 33%|███▎      | 20/60 [00:01<00:02, 14.25it/s]
 37%|███▋      | 22/60 [00:01<00:02, 13.42it/s]
 40%|████      | 24/60 [00:01<00:02, 12.85it/s]
 43%|████▎     | 26/60 [00:01<00:02, 12.47it/s]
 47%|████▋     | 28/60 [00:02<00:02, 12.22it/s]
 50%|█████     | 30/60 [00:02<00:02, 12.03it/s]
 53%|█████▎    | 32/60 [00:02<00:02, 11.90it/s]
 57%|█████▋    | 34/60 [00:02<00:02, 11.80it/s]
 60%|██████    | 36/60 [00:02<00:02, 11.74it/s]
 63%|██████▎   | 38/60 [00:02<00:01, 11.70it/s]
 67%|██████▋   | 40/60 [00:03<00:01, 11.66it/s]
 70%|███████   | 42/60 [00:03<00:01, 11.65it/s]
 73%|███████▎  | 44/60 [00:03<00:01, 11.63it/s]
 77%|███████▋  | 46/60 [00:03<00:01, 12.49it/s]
 82%|████████▏ | 49/60 [00:03<00:00, 15.23it/s]
 87%|████████▋ | 52/60 [00:03<00:00, 17.35it/s]
 92%|█████████▏| 55/60 [00:04<00:00, 18.55it/s]
 97%|█████████▋| 58/60 [00:04<00:00, 20.21it/s]
100%|██████████| 60/60 [00:04<00:00, 14.10it/s]
  0%|          | 0/1 [00:00<?, ?it/s]
2025-05-14 15:17:28.229 | WARNING  | pipeline_ace_step:save_wav_file:1419 - save_path is None, using default path ./outputs/
100%|██████████| 1/1 [00:00<00:00,  1.07it/s]
100%|██████████| 1/1 [00:00<00:00,  1.07it/s]
Audio generated at: ./outputs/output_20250514151728_0.mp3
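The logged start_idx: 15 and end_idx: 45 for 60 steps are consistent with guidance_interval = 0.5 selecting a window centered on the middle of the schedule. The sketch below reproduces those numbers under that assumption; guidance_window is a hypothetical name and the formula is inferred from the log, not quoted from the pipeline source. The slower iteration rate between roughly steps 16 and 45 in the progress bar would fit guidance being applied only inside this window, though the log does not state that directly.

```python
def guidance_window(num_steps, guidance_interval):
    """Return (start_idx, end_idx) of a guidance window centered in the schedule.

    Assumption: the window covers the fraction `guidance_interval` of all
    steps and is centered on the midpoint of the schedule.
    """
    start_idx = int(num_steps * ((1 - guidance_interval) / 2))
    end_idx = int(num_steps * (guidance_interval / 2 + 0.5))
    return start_idx, end_idx


print(guidance_window(60, 0.5))  # (15, 45), matching the log line above
```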

Version Details

Version ID: 280fc4f9ee507577f880a167f639c02622421d8fecf492454320311217b688f1
Version Created: May 14, 2025
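To pin a request to this exact version, the model name and version ID can be combined into a single reference. The sketch below assumes the third-party replicate Python client is installed and a REPLICATE_API_TOKEN is set in the environment; model_ref and generate are hypothetical wrappers, and the type of the returned value depends on the client version.

```python
MODEL = "lucataco/ace-step"
VERSION = "280fc4f9ee507577f880a167f639c02622421d8fecf492454320311217b688f1"


def model_ref(model=MODEL, version=VERSION):
    """Build the "owner/name:version" reference for a pinned model version."""
    return f"{model}:{version}"


def generate(tags, lyrics="[instrumental]", duration=60):
    """Run one prediction against the pinned version (requires network access)."""
    # Imported lazily so this module loads even without the client installed.
    import replicate

    return replicate.run(
        model_ref(),
        input={"tags": tags, "lyrics": lyrics, "duration": duration},
    )
```

Inputs that are omitted fall back to the defaults listed under Input Parameters.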
