meta/musicgen

3.3M runs • Jun 2023 • Cog 0.8.6

audio-to-audio music-generation

About

Generate music from a prompt or melody

Example Output

Prompt:

"Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic"

Output

Performance Metrics

66.37s Prediction Time

66.46s Total Time

All Input Parameters

{
  "top_k": 250,
  "top_p": 0,
  "prompt": "Edo25 major g melodies that sound triumphant and cinematic. Leading up to a crescendo that resolves in a 9th harmonic",
  "duration": 8,
  "temperature": 1,
  "continuation": false,
  "model_version": "stereo-large",
  "output_format": "mp3",
  "continuation_start": 0,
  "multi_band_diffusion": false,
  "normalization_strategy": "peak",
  "classifier_free_guidance": 3
}
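The same request could be sent from Python roughly as follows. This is a minimal sketch, assuming the official `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set; the `generate` helper name is hypothetical, and the version hash is the one listed under Version Details on this page.

```python
# Input copied from the "All Input Parameters" example on this page.
EXAMPLE_INPUT = {
    "top_k": 250,
    "top_p": 0,
    "prompt": (
        "Edo25 major g melodies that sound triumphant and cinematic. "
        "Leading up to a crescendo that resolves in a 9th harmonic"
    ),
    "duration": 8,
    "temperature": 1,
    "continuation": False,
    "model_version": "stereo-large",
    "output_format": "mp3",
    "continuation_start": 0,
    "multi_band_diffusion": False,
    "normalization_strategy": "peak",
    "classifier_free_guidance": 3,
}

# "owner/name:version", using the Version ID shown under Version Details.
MODEL_REF = (
    "meta/musicgen:"
    "671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb"
)


def generate(input_params=EXAMPLE_INPUT):
    """Hypothetical helper: run one prediction and return whatever the client returns.

    Assumes `pip install replicate` and REPLICATE_API_TOKEN in the environment.
    """
    import replicate  # imported lazily so this file loads without the package

    return replicate.run(MODEL_REF, input=input_params)
```

Depending on the client version, the return value may be a URL string or a file-like object wrapping that URL, so it is worth checking the type before using it.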

Input Parameters

seed (Type: integer): Seed for the random number generator. If None or -1, a random seed will be used.
top_k (Type: integer, Default: 250): Reduces sampling to the k most likely tokens.
top_p (Type: number, Default: 0): Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.
prompt (Type: string): A description of the music you want to generate.
duration (Type: integer, Default: 8): Duration of the generated audio in seconds.
input_audio (Type: string): An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.
temperature (Type: number, Default: 1): Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.
continuation (Type: boolean, Default: false): If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.
model_version (Default: stereo-melody-large): Model to use for generation.
output_format (Default: wav): Output format for generated audio.
continuation_end (Type: integer, Range: 0 - ∞): End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
continuation_start (Type: integer, Default: 0, Range: 0 - ∞): Start time of the audio file to use for continuation.
multi_band_diffusion (Type: boolean, Default: false): If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.
normalization_strategy (Default: loudness): Strategy for normalizing audio.
classifier_free_guidance (Type: integer, Default: 3): Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.
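To make the sampling parameters more concrete, here is a small sketch of how `temperature`, `top_k`, `top_p`, and `classifier_free_guidance` are commonly applied to next-token logits. It illustrates the standard techniques in plain Python and is not the model's actual code; all function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_tokens(probs, top_k=250, top_p=0.0):
    """Return the token indices kept for sampling, most likely first.

    Mirrors the parameter description: a positive top_p selects nucleus
    sampling, and top_p == 0 falls back to top_k.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        return kept
    if top_k > 0:
        return order[:top_k]
    return order


def guided_logits(cond, uncond, scale=3.0):
    """Classifier-free guidance: move logits away from the unconditional prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

For example, `candidate_tokens([0.5, 0.3, 0.2], top_k=2)` keeps the two most likely tokens, while passing `top_p=0.4` keeps only the first because its probability alone already reaches the threshold.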

Output Schema

Output

Type: string • Format: uri
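Because the output is a URI rather than raw audio, a caller typically downloads the file afterwards. A minimal sketch using only the standard library follows; the helper names and the fallback filename are assumptions made for illustration.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def output_filename(uri, default="output.wav"):
    """Derive a local filename from the last path segment of the output URI."""
    name = posixpath.basename(urlparse(uri).path)
    return name or default


def save_output(uri, directory="."):
    """Download the generated audio into `directory` and return the local path."""
    target = os.path.join(directory, output_filename(uri))
    urllib.request.urlretrieve(uri, target)
    return target
```

`save_output` performs a network request, so it should only be called with a URI returned by an actual prediction.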

Example Execution Logs

Loading model stereo-large...
Downloading models--facebook--musicgen-stereo-large to models
Downloaded models--facebook--musicgen-stereo-large in 9.88s
Model stereo-large loaded successfully.
Using seed 187926229
1 /    400
2 /    400
3 /    400
4 /    400
5 /    400
6 /    400
7 /    400
8 /    400
9 /    400
10 /    400
11 /    400
12 /    400
13 /    400
14 /    400
15 /    400
16 /    400
17 /    400
18 /    400
19 /    400
20 /    400
21 /    400
22 /    400
23 /    400
24 /    400
25 /    400
26 /    400
27 /    400
28 /    400
29 /    400
30 /    400
31 /    400
32 /    400
33 /    400
34 /    400
35 /    400
36 /    400
37 /    400
38 /    400
39 /    400
40 /    400
41 /    400
42 /    400
43 /    400
44 /    400
45 /    400
46 /    400
47 /    400
48 /    400
49 /    400
50 /    400
51 /    400
52 /    400
53 /    400
54 /    400
55 /    400
56 /    400
57 /    400
58 /    400
59 /    400
60 /    400
61 /    400
62 /    400
63 /    400
64 /    400
65 /    400
66 /    400
67 /    400
68 /    400
69 /    400
70 /    400
71 /    400
72 /    400
73 /    400
74 /    400
75 /    400
76 /    400
77 /    400
78 /    400
79 /    400
80 /    400
81 /    400
82 /    400
83 /    400
84 /    400
85 /    400
86 /    400
87 /    400
88 /    400
89 /    400
90 /    400
91 /    400
92 /    400
93 /    400
94 /    400
95 /    400
96 /    400
97 /    400
98 /    400
99 /    400
100 /    400
101 /    400
102 /    400
103 /    400
104 /    400
105 /    400
106 /    400
107 /    400
108 /    400
109 /    400
110 /    400
111 /    400
112 /    400
113 /    400
114 /    400
115 /    400
116 /    400
117 /    400
118 /    400
119 /    400
120 /    400
121 /    400
122 /    400
123 /    400
124 /    400
125 /    400
126 /    400
127 /    400
128 /    400
129 /    400
130 /    400
131 /    400
132 /    400
133 /    400
134 /    400
135 /    400
136 /    400
137 /    400
138 /    400
139 /    400
140 /    400
141 /    400
142 /    400
143 /    400
144 /    400
145 /    400
146 /    400
147 /    400
148 /    400
149 /    400
150 /    400
151 /    400
152 /    400
153 /    400
154 /    400
155 /    400
156 /    400
157 /    400
158 /    400
159 /    400
160 /    400
161 /    400
162 /    400
163 /    400
164 /    400
165 /    400
166 /    400
167 /    400
168 /    400
169 /    400
170 /    400
171 /    400
172 /    400
173 /    400
174 /    400
175 /    400
176 /    400
177 /    400
178 /    400
179 /    400
180 /    400
181 /    400
182 /    400
183 /    400
184 /    400
185 /    400
186 /    400
187 /    400
188 /    400
189 /    400
190 /    400
191 /    400
192 /    400
193 /    400
194 /    400
195 /    400
196 /    400
197 /    400
198 /    400
199 /    400
200 /    400
201 /    400
202 /    400
203 /    400
204 /    400
205 /    400
206 /    400
207 /    400
208 /    400
209 /    400
210 /    400
211 /    400
212 /    400
213 /    400
214 /    400
215 /    400
216 /    400
217 /    400
218 /    400
219 /    400
220 /    400
221 /    400
222 /    400
223 /    400
224 /    400
225 /    400
226 /    400
227 /    400
228 /    400
229 /    400
230 /    400
231 /    400
232 /    400
233 /    400
234 /    400
235 /    400
236 /    400
237 /    400
238 /    400
239 /    400
240 /    400
241 /    400
242 /    400
243 /    400
244 /    400
245 /    400
246 /    400
247 /    400
248 /    400
249 /    400
250 /    400
251 /    400
252 /    400
253 /    400
254 /    400
255 /    400
256 /    400
257 /    400
258 /    400
259 /    400
260 /    400
261 /    400
262 /    400
263 /    400
264 /    400
265 /    400
266 /    400
267 /    400
268 /    400
269 /    400
270 /    400
271 /    400
272 /    400
273 /    400
274 /    400
275 /    400
276 /    400
277 /    400
278 /    400
279 /    400
280 /    400
281 /    400
282 /    400
283 /    400
284 /    400
285 /    400
286 /    400
287 /    400
288 /    400
289 /    400
290 /    400
291 /    400
292 /    400
293 /    400
294 /    400
295 /    400
296 /    400
297 /    400
298 /    400
299 /    400
300 /    400
301 /    400
302 /    400
303 /    400
304 /    400
305 /    400
306 /    400
307 /    400
308 /    400
309 /    400
310 /    400
311 /    400
312 /    400
313 /    400
314 /    400
315 /    400
316 /    400
317 /    400
318 /    400
319 /    400
320 /    400
321 /    400
322 /    400
323 /    400
324 /    400
325 /    400
326 /    400
327 /    400
328 /    400
329 /    400
330 /    400
331 /    400
332 /    400
333 /    400
334 /    400
335 /    400
336 /    400
337 /    400
338 /    400
339 /    400
340 /    400
341 /    400
342 /    400
343 /    400
344 /    400
345 /    400
346 /    400
347 /    400
348 /    400
349 /    400
350 /    400
351 /    400
352 /    400
353 /    400
354 /    400
355 /    400
356 /    400
357 /    400
358 /    400
359 /    400
360 /    400
361 /    400
362 /    400
363 /    400
364 /    400
365 /    400
366 /    400
367 /    400
368 /    400
369 /    400
370 /    400
371 /    400
372 /    400
373 /    400
374 /    400
375 /    400
376 /    400
377 /    400
378 /    400
379 /    400
380 /    400
381 /    400
382 /    400
383 /    400
384 /    400
385 /    400
386 /    400
387 /    400
388 /    400
389 /    400
390 /    400
391 /    400
392 /    400
393 /    400
394 /    400
395 /    400
396 /    400
397 /    400
398 /    400
399 /    400
400 /    400
401 /    400
402 /    400
403 /    400
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'out.wav':
Metadata:
encoder         : Lavf58.76.100
Duration: 00:00:08.00, bitrate: 1024 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, stereo, s16, 1024 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to 'out.mp3':
Metadata:
TSSE            : Lavf58.76.100
Stream #0:0: Audio: mp3, 32000 Hz, stereo, s16p
Metadata:
encoder         : Lavc58.134.100 libmp3lame
size=       0kB time=00:00:00.00 bitrate=N/A speed=N/A
size=      95kB time=00:00:07.99 bitrate=  97.1kbits/s speed=45.6x
video:0kB audio:94kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.269717%
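Two numbers in these logs can be cross-checked with simple arithmetic. The sketch below assumes a token frame rate of 50 frames per second and a maximum codebook delay of 3 steps, which is one plausible reading of why the counter shows a total of 400 for an 8-second clip and runs on to 403; neither constant is stated on this page. The PCM bitrate check uses only the values ffmpeg reports for `out.wav`.

```python
FRAME_RATE_HZ = 50       # assumption: audio token frames generated per second
MAX_CODEBOOK_DELAY = 3   # assumption: extra steps added by the codebook delay pattern


def generation_steps(duration_s):
    """Return (token frames, total decoding steps) for a clip of duration_s seconds."""
    frames = duration_s * FRAME_RATE_HZ
    return frames, frames + MAX_CODEBOOK_DELAY


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


print(generation_steps(8))             # (400, 403)
print(pcm_bitrate_kbps(32000, 16, 2))  # 1024.0
```

The second value agrees with the `1024 kb/s` that ffmpeg reports for the 32000 Hz, stereo, 16-bit input stream.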

Version Details

Version ID: 671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb
Version Created: March 28, 2024
