meta/musicgen
About
Generate music from a prompt or melody
Example Output
Prompt:
"Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic"
Output
Performance Metrics
66.37s
Prediction Time
66.46s
Total Time
All Input Parameters
{
"top_k": 250,
"top_p": 0,
"prompt": "Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic",
"duration": 8,
"temperature": 1,
"continuation": false,
"model_version": "stereo-large",
"output_format": "mp3",
"continuation_start": 0,
"multi_band_diffusion": false,
"normalization_strategy": "peak",
"classifier_free_guidance": 3
}
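As an illustration, the parameters above could be sent from Python roughly as follows. This is a minimal sketch, not the page's documented method: it assumes the model is hosted on Replicate, that the third-party `replicate` client is installed, and that `REPLICATE_API_TOKEN` is set. The helper names `build_input` and `generate` are hypothetical. The version ID is the one listed under Version Details.

```python
import os

# Version ID as listed under "Version Details" on this page.
VERSION = "671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb"


def build_input(prompt, duration=8):
    """Hypothetical helper: mirror the example's input parameters."""
    return {
        "top_k": 250,
        "top_p": 0,
        "prompt": prompt,
        "duration": duration,
        "temperature": 1,
        "continuation": False,
        "model_version": "stereo-large",
        "output_format": "mp3",
        "continuation_start": 0,
        "multi_band_diffusion": False,
        "normalization_strategy": "peak",
        "classifier_free_guidance": 3,
    }


def generate(prompt, duration=8):
    """Hypothetical helper: run the model through the Replicate client."""
    # Imported lazily so the rest of the sketch works without the package.
    import replicate  # assumption: installed via `pip install replicate`

    return replicate.run(
        f"meta/musicgen:{VERSION}",
        input=build_input(prompt, duration),
    )


if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    # Only attempts a network call when an API token is available.
    print(generate("triumphant, cinematic melody building to a crescendo"))
```

Omitting `seed` leaves it unset, which the parameter list below describes as selecting a random seed.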
Input Parameters
- seed
- Seed for random number generator. If None or -1, a random seed will be used.
- top_k
- Reduces sampling to the k most likely tokens.
- top_p
- Reduces sampling to the smallest set of most likely tokens whose cumulative probability reaches p. When set to `0` (default), top_k sampling is used.
- prompt
- A description of the music you want to generate.
- duration
- Duration of the generated audio in seconds.
- input_audio
- An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.
- temperature
- Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.
- continuation
- If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.
- model_version
- Model to use for generation.
- output_format
- Output format for generated audio.
- continuation_end
- End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
- continuation_start
- Start time of the audio file to use for continuation.
- multi_band_diffusion
- If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.
- normalization_strategy
- Strategy for normalizing audio.
- classifier_free_guidance
- Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.
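To make the sampling parameters concrete, here is a small sketch of how `temperature`, `top_k`, `top_p`, and `classifier_free_guidance` are commonly applied to a model's next-token logits. It is a generic illustration in plain Python, not the model's actual implementation. All function names are hypothetical, and the guidance formula is the standard classifier-free guidance blend, assumed rather than taken from this page.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_tokens(probs, top_k=250, top_p=0.0):
    """Return the token indices that stay eligible for sampling."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0:
        # Nucleus sampling: keep tokens until cumulative probability reaches p.
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        return kept
    # top_p == 0: fall back to the k most likely tokens.
    return ranked[:top_k]


def sample(logits, temperature=1.0, top_k=250, top_p=0.0, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = softmax(logits, temperature)
    kept = candidate_tokens(probs, top_k, top_p)
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


def guided_logits(cond, uncond, scale=3.0):
    """Assumed guidance blend: push logits toward the conditioned prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

With a scale of 1 the guided logits equal the conditioned ones; larger scales exaggerate the difference between conditioned and unconditioned predictions, which is consistent with the description of outputs adhering more closely to the inputs.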
Output Schema
Output
Example Execution Logs
Loading model stereo-large...
Downloading models--facebook--musicgen-stereo-large to models
Downloaded models--facebook--musicgen-stereo-large in 9.88s
Model stereo-large loaded successfully.
Using seed 187926229
1 / 400
2 / 400
3 / 400
...
398 / 400
399 / 400
400 / 400
401 / 400
402 / 400
403 / 400
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'out.wav':
Metadata:
encoder : Lavf58.76.100
Duration: 00:00:08.00, bitrate: 1024 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, stereo, s16, 1024 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to 'out.mp3':
Metadata:
TSSE : Lavf58.76.100
Stream #0:0: Audio: mp3, 32000 Hz, stereo, s16p
Metadata:
encoder : Lavc58.134.100 libmp3lame
size= 0kB time=00:00:00.00 bitrate=N/A speed=N/A
size= 95kB time=00:00:07.99 bitrate= 97.1kbits/s speed=45.6x
video:0kB audio:94kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.269717%
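The numbers in these logs can be cross-checked with a little arithmetic. The sketch below assumes a token rate of 50 frames per second and 4 codebooks decoded with a one-step stagger between them. Neither value is stated on this page, but together they would account for the 400 steps reported for 8 seconds of audio and the 3 extra steps at the end of the counter. The WAV bitrate check uses only values printed by ffmpeg above.

```python
FRAME_RATE_HZ = 50  # assumption: audio tokens generated per second of output
N_CODEBOOKS = 4     # assumption: parallel token streams with staggered delays


def decoding_steps(duration_s, frame_rate=FRAME_RATE_HZ):
    """Nominal number of generation steps for a clip of the given length."""
    return duration_s * frame_rate


def steps_with_delay(duration_s, n_codebooks=N_CODEBOOKS):
    """Steps including the assumed one-step stagger per extra codebook."""
    return decoding_steps(duration_s) + (n_codebooks - 1)


def pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample):
    """Uncompressed PCM bitrate in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * channels * bits_per_sample // 1000


print(decoding_steps(8))               # nominal steps for duration = 8
print(steps_with_delay(8))             # last counter value under the assumption
print(pcm_bitrate_kbps(32000, 2, 16))  # 32000 Hz, stereo, s16
print(round(66.37 / 8, 2))             # seconds of compute per second of audio
```

The last line divides the reported prediction time by the requested duration, giving a rough sense of how much slower than real time this run was.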
Version Details
- Version ID
- 671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb
- Version Created
- March 28, 2024