e7mac/omnizart

▶️ 292 runs 📅 Aug 2023 ⚙️ Cog 0.8.6
audio-to-midi music-transcription music-understanding

About

Example Output

Performance Metrics

23.27s Prediction Time
23.27s Total Time
All Input Parameters
{
  "mode": "chord",
  "audio": "https://replicate.delivery/pbxt/JN0hxKyz6zBscSeQ4HEQi5hzfOgFRKtGJ05a70zEoM0rXWQz/Carolina%20Gaita%CC%81n%20-%20La%20Gaita%20-%20We%20Don%27t%20Talk%20About%20Bruno.mp3"
}
Input Parameters
mode Default: music-piano-v2
Transcription mode
audio (required) Type: string
Path to the input music. Supports mp3 and wav formats.
render_audio Type: boolean Default: false
Option to render to mp3
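A minimal sketch of assembling the three input parameters above and submitting them, assuming the third-party `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are hypothetical, and the extension check is only a convenience based on the "mp3 and wav" note, not something the model is known to enforce.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

# Model reference built from the owner/name and the version ID listed on this page.
MODEL_REF = (
    "e7mac/omnizart:"
    "6531a3a74e8b3bdde76dc3d9fa30f3d1fa10c8a7717f1e1de0a11e7f1194cc75"
)


def build_input(audio, mode="music-piano-v2", render_audio=False):
    """Return an input dict with the parameters listed above (hypothetical helper)."""
    # Look at the file extension of the URL or path, ignoring percent-encoding.
    suffix = PurePosixPath(unquote(urlparse(audio).path)).suffix.lower()
    if suffix not in {".mp3", ".wav"}:
        raise ValueError(f"expected an mp3 or wav file, got {suffix!r}")
    return {"mode": mode, "audio": audio, "render_audio": bool(render_audio)}


def run_prediction(payload):
    """Submit the payload; needs network access and an API token (assumption)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(MODEL_REF, input=payload)
```

For the chord example shown under All Input Parameters, the call would look like `run_prediction(build_input(audio_url, mode="chord"))`, where `audio_url` is the mp3 link from that JSON.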
Output Schema
csv Type: string Format: uri
Csv
wav Type: string Format: uri
Wav
midi Type: string Format: uri
Midi
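The output schema above describes three URI fields. The following is a rough sketch of saving them locally, assuming the prediction result behaves like a dict keyed by `csv`, `wav`, and `midi` whose values convert to URL strings; that shape depends on the client version and is not confirmed by this page. `plan_downloads` and `download_all` are hypothetical helpers. The file extension is taken from each URI rather than from the field name, since the execution log shows the rendered audio being encoded to `out.mp3`.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse
from urllib.request import urlretrieve

OUTPUT_FIELDS = ("csv", "wav", "midi")  # field names from the output schema


def plan_downloads(output):
    """Return (field, uri, filename) tuples for every non-empty output field."""
    plan = []
    for field in OUTPUT_FIELDS:
        uri = output.get(field)
        if not uri:
            continue  # e.g. audio may be absent when render_audio is false (assumption)
        uri = str(uri)
        # Prefer the extension in the URI; fall back to the field name.
        suffix = PurePosixPath(unquote(urlparse(uri).path)).suffix or f".{field}"
        plan.append((field, uri, f"{field}{suffix}"))
    return plan


def download_all(output, dest_dir="."):
    """Fetch each planned file into dest_dir (requires network access)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for _field, uri, filename in plan_downloads(output):
        target = dest / filename
        urlretrieve(uri, str(target))
        saved.append(target)
    return saved
```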
Example Execution Logs
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Input #0, mp3, from '/tmp/tmp1haztl3aCarolina Gaitán - La Gaita - We Don't Talk About Bruno.mp3':
Metadata:
track           : 4/44
title           : We Don't Talk About Bruno
album           : Encanto (Original Motion Picture Soundtrack)
encoder         : Lavf58.76.100
artist          : Carolina Gaitán - La Gaita, Mauro Castillo, Adassa, Rhenzy Feliz, Diane Guerrero, Stephanie Beatriz, Encanto - Cast
disc            : 1
date            : 2021-11-19
Duration: 00:03:41.38, start: 0.023021, bitrate: 233 kb/s
Stream #0:0: Audio: mp3, 48000 Hz, stereo, fltp, 228 kb/s
Metadata:
encoder         : Lavc58.13
Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 640x640 [SAR 600:600 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
title           : Album cover
comment         : Cover (front)
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'cog_temp/tmp1haztl3aCarolina Gaitán - La Gaita - We Don't Talk About Bruno.wav':
Metadata:
IPRT            : 4/44
INAM            : We Don't Talk About Bruno
IPRD            : Encanto (Original Motion Picture Soundtrack)
ICRD            : 2021-11-19
IART            : Carolina Gaitán - La Gaita, Mauro Castillo, Adassa, Rhenzy Feliz, Diane Guerrero, Stephanie Beatriz, Encanto - Cast
disc            : 1
ISFT            : Lavf58.76.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
encoder         : Lavc58.134.100 pcm_s16le
size=       1kB time=00:00:00.00 bitrate=N/A speed=   0x
size=   41500kB time=00:03:41.32 bitrate=1536.0kbits/s speed= 938x
video:0kB audio:41500kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000767%
2023-08-18 06:01:10 Extracting feature
INFO:Chord Application:Extracting feature
2023-08-18 06:01:13 Loading model
INFO:Chord Application:Loading model
2023-08-18 06:01:13 Using built-in model /src/omnizart/checkpoints/chord/chord_v1 for transcription.
INFO:Base Class:Using built-in model /src/omnizart/checkpoints/chord/chord_v1 for transcription.
WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.
WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.
WARNING:absl:Importing a function (__inference_encoder_layer_call_and_return_conditional_losses_19575) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_encoder_layer_call_and_return_conditional_losses_21695) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_encoder_layer_call_and_return_conditional_losses_76469) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_encoder_layer_call_and_return_conditional_losses_74349) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_chord_model_layer_call_and_return_conditional_losses_63102) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_chord_model_layer_call_and_return_conditional_losses_45289) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_chord_model_layer_call_and_return_conditional_losses_71739) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_chord_model_layer_call_and_return_conditional_losses_53926) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference__wrapped_model_17395) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
2023-08-18 06:01:26 Preparing feature for model prediction
INFO:Chord Application:Preparing feature for model prediction
2023-08-18 06:01:26 Predicting...
INFO:Chord Application:Predicting...
1/2 [==============>...............] - ETA: 1s
2/2 [==============================] - 1s 20ms/step
2023-08-18 06:01:28 Infering chords...
INFO:Chord Application:Infering chords...
2023-08-18 06:01:28 MIDI file has been written to ./tmp1haztl3aCarolina Gaitán - La Gaita - We Don't Talk About Bruno.mid.
INFO:Base Class:MIDI file has been written to ./tmp1haztl3aCarolina Gaitán - La Gaita - We Don't Talk About Bruno.mid.
2023-08-18 06:01:28 MIDI and CSV file have been written to /src
INFO:Chord Application:MIDI and CSV file have been written to /src
2023-08-18 06:01:28 Transcription finished
INFO:Chord Application:Transcription finished
Synthesizing MIDI...
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'cog_temp/tmp1haztl3aCarolina Gaitán - La Gaita - We Don't Talk About Bruno_synth.wav':
Duration: 00:03:34.90, bitrate: 2822 kb/s
Stream #0:0: Audio: pcm_f64le ([3][0][0][0] / 0x0003), 44100 Hz, mono, dbl, 2822 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_f64le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, mp3, to '/tmp/tmpww3zelat/out.mp3':
Metadata:
TSSE            : Lavf58.76.100
Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp
Metadata:
encoder         : Lavc58.134.100 libmp3lame
size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
size=     512kB time=00:01:15.23 bitrate=  55.8kbits/s speed= 150x
size=    1024kB time=00:02:28.61 bitrate=  56.4kbits/s speed= 149x
size=    1679kB time=00:03:34.88 bitrate=  64.0kbits/s speed= 149x
video:0kB audio:1679kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.013202%
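The PCM bitrates that ffmpeg reports in the log above can be checked by hand, since uncompressed audio has a bitrate of sample rate × channels × bits per sample. The sketch below reproduces the 1536 kb/s figure for the 48000 Hz stereo 16-bit WAV and the 2822 kb/s figure for the 44100 Hz mono 64-bit float synth file. It assumes ffmpeg's "kb/s" means 1000 bits per second and its "kB" means 1024 bytes; the function names are illustrative only.

```python
def pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample):
    """Bitrate of uncompressed PCM audio in kilobits per second (1 kb = 1000 bits)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


def pcm_size_kib(bitrate_kbps, seconds):
    """Approximate payload size in units of 1024 bytes, ignoring container headers."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * seconds / 1024


# Decoded input: pcm_s16le, 48000 Hz, stereo
wav_kbps = pcm_bitrate_kbps(48000, 2, 16)

# Synthesized MIDI render: pcm_f64le, 44100 Hz, mono
synth_kbps = pcm_bitrate_kbps(44100, 1, 64)

# 00:03:41.32 of audio at the WAV bitrate, in seconds
wav_kib = pcm_size_kib(wav_kbps, 3 * 60 + 41.32)

print(wav_kbps, synth_kbps, round(wav_kib, 1))
```

The computed size of roughly 41497.5 kB is close to the 41500kB that ffmpeg prints; the small gap is consistent with header bytes and the rounded timestamp.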
Version Details
Version ID
6531a3a74e8b3bdde76dc3d9fa30f3d1fa10c8a7717f1e1de0a11e7f1194cc75
Version Created
August 18, 2023