meta/musicgen
About
Generate music from a prompt or melody

Example Output
Prompt:
"Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic"
Output
Performance Metrics
- Prediction Time: 66.37s
- Total Time: 66.46s
All Input Parameters
{
  "top_k": 250,
  "top_p": 0,
  "prompt": "Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic",
  "duration": 8,
  "temperature": 1,
  "continuation": false,
  "model_version": "stereo-large",
  "output_format": "mp3",
  "continuation_start": 0,
  "multi_band_diffusion": false,
  "normalization_strategy": "peak",
  "classifier_free_guidance": 3
}
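The payload above can be assembled and checked in code before it is submitted. The following is a minimal sketch, not the page's official client code: `build_input` and `generate` are hypothetical helper names, the `"stereo"` prefix check is an assumption about how stereo model versions are named (based only on `stereo-large`), and `generate` assumes the `replicate` Python package is installed with `REPLICATE_API_TOKEN` set in the environment.

```python
# Version ID copied from the "Version Details" section of this page.
MUSICGEN_VERSION = "671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb"

# Parameter names listed under "Input Parameters" on this page.
KNOWN_PARAMETERS = frozenset({
    "seed", "top_k", "top_p", "prompt", "duration", "input_audio",
    "temperature", "continuation", "model_version", "output_format",
    "continuation_end", "continuation_start", "multi_band_diffusion",
    "normalization_strategy", "classifier_free_guidance",
})

# Values taken from the example run shown above.
EXAMPLE_INPUT = {
    "top_k": 250,
    "top_p": 0,
    "prompt": "Edo25 major g melodies that sound triumphant and cinematic. "
              "Leading up to a crescendo that resolves in a 9th harmonic",
    "duration": 8,
    "temperature": 1,
    "continuation": False,
    "model_version": "stereo-large",
    "output_format": "mp3",
    "continuation_start": 0,
    "multi_band_diffusion": False,
    "normalization_strategy": "peak",
    "classifier_free_guidance": 3,
}


def build_input(prompt, duration=8, **overrides):
    """Return a copy of the example payload with a new prompt and duration."""
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = dict(EXAMPLE_INPUT)
    payload.update(overrides)
    payload["prompt"] = prompt
    payload["duration"] = duration
    # The page states MultiBand Diffusion only works with non-stereo models.
    # Assumption: stereo versions are recognizable by a "stereo" name prefix.
    if payload["multi_band_diffusion"] and str(payload["model_version"]).startswith("stereo"):
        raise ValueError("multi_band_diffusion requires a non-stereo model_version")
    return payload


def generate(payload):
    """Submit the payload. Needs network access and an API token."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(f"meta/musicgen:{MUSICGEN_VERSION}", input=payload)
```

Keeping payload construction separate from the network call means the parameter checks can be exercised offline, for example `build_input("slow ambient pads", duration=10, seed=42)`.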
Input Parameters
- seed
- Seed for random number generator. If None or -1, a random seed will be used.
- top_k
- Reduces sampling to the k most likely tokens.
- top_p
- Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.
- prompt
- A description of the music you want to generate.
- duration
- Duration of the generated audio in seconds.
- input_audio
- An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.
- temperature
- Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.
- continuation
- If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.
- model_version
- Model to use for generation.
- output_format
- Output format for generated audio.
- continuation_end
- End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
- continuation_start
- Start time of the audio file to use for continuation.
- multi_band_diffusion
- If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.
- normalization_strategy
- Strategy for normalizing audio.
- classifier_free_guidance
- Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.
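To make the sampling parameters concrete, here is a small pure-Python sketch of how `classifier_free_guidance`, `temperature`, `top_k`, and `top_p` could interact when choosing a single token. It is an illustration under stated assumptions, not the model's actual implementation: the function names are hypothetical, the guidance formula is the standard classifier-free guidance blend, and the real model operates on tensors rather than Python lists.

```python
import math
import random


def apply_guidance(cond_logits, uncond_logits, scale):
    """Standard classifier-free guidance blend (assumed, not taken from this page).

    A larger scale moves the logits further toward the conditioned prediction.
    """
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities. temperature must be greater than 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_tokens(probs, top_k=250, top_p=0.0):
    """Return the token ids that survive filtering, most likely first.

    Mirrors the parameter description: top_p is used when it is above 0,
    otherwise sampling falls back to the top_k most likely tokens.
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        return kept
    if top_k > 0:
        return ranked[:top_k]
    return ranked


def sample_token(logits, temperature=1.0, top_k=250, top_p=0.0, rng=random):
    """Draw one token id from the filtered distribution."""
    probs = softmax(logits, temperature)
    kept = candidate_tokens(probs, top_k, top_p)
    weights = [probs[i] for i in kept]  # random.choices renormalizes the weights
    return rng.choices(kept, weights=weights, k=1)[0]
```

For example, `sample_token([2.0, 1.0, 0.0], top_k=1)` can only ever return token `0`, while raising `temperature` flattens the distribution and makes the other tokens more likely to be drawn.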
Output Schema
Output
Example Execution Logs
Loading model stereo-large...
Downloading models--facebook--musicgen-stereo-large to models
Downloaded models--facebook--musicgen-stereo-large in 9.88s
Model stereo-large loaded successfully.
Using seed 187926229
1 / 400
2 / 400
3 / 400
[progress lines 4 / 400 through 402 / 400 omitted]
403 / 400
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil 56. 70.100 / 56. 70.100
  libavcodec 58.134.100 / 58.134.100
  libavformat 58. 76.100 / 58. 76.100
  libavdevice 58. 13.100 / 58. 13.100
  libavfilter 7.110.100 / 7.110.100
  libswscale 5. 9.100 / 5. 9.100
  libswresample 3. 9.100 / 3. 9.100
  libpostproc 55. 9.100 / 55. 9.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'out.wav':
  Metadata:
    encoder : Lavf58.76.100
  Duration: 00:00:08.00, bitrate: 1024 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, stereo, s16, 1024 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to 'out.mp3':
  Metadata:
    TSSE : Lavf58.76.100
  Stream #0:0: Audio: mp3, 32000 Hz, stereo, s16p
    Metadata:
      encoder : Lavc58.134.100 libmp3lame
size= 0kB time=00:00:00.00 bitrate=N/A speed=N/A
size= 95kB time=00:00:07.99 bitrate= 97.1kbits/s speed=45.6x
video:0kB audio:94kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.269717%
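Two figures in these logs can be checked with simple arithmetic: the progress total of 400 for an 8 second clip, and the 1024 kb/s bitrate ffmpeg reports for the intermediate WAV file. The sketch below assumes a token frame rate of 50 frames per second, which matches the public description of MusicGen's 32 kHz audio tokenizer but is not stated on this page; the function names are hypothetical. It does not attempt to explain why the counter in the log runs slightly past 400.

```python
# Assumption: the audio tokenizer emits 50 frames per second of audio.
FRAME_RATE_HZ = 50


def expected_steps(duration_s, frame_rate_hz=FRAME_RATE_HZ):
    """Number of generation steps for a clip of the given length."""
    return duration_s * frame_rate_hz


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def realtime_factor(prediction_s, audio_s):
    """Seconds of compute spent per second of generated audio."""
    return prediction_s / audio_s


# Values from the example run: duration 8 s, 32000 Hz, 16-bit (s16), stereo.
print(expected_steps(8))                   # progress total shown in the log
print(pcm_bitrate_kbps(32000, 16, 2))      # bitrate ffmpeg reports for out.wav
print(round(realtime_factor(66.37, 8), 2)) # prediction time relative to clip length
```

Under these assumptions the step count and the WAV bitrate agree with the log, which is a quick sanity check that `duration` and the stereo 32 kHz output were applied as requested.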
Version Details
- Version ID
- 671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb
- Version Created
- March 28, 2024