sakemin/all-in-one-music-structure-analyzer

71.1K runs, Dec 2023, Cog 0.8.6
Tags: audio-embedding, music-understanding

About

Cog implementation of the "All-In-One Music Structure Analyzer" by mir-aidj (Taejun Kim).

Example Output

[Three example output files; not reproduced here]

Performance Metrics

65.82s Prediction Time
141.69s Total Time
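
The two timings can be put in context with some quick arithmetic. The sketch below assumes the 204.75 figure in the separation progress bar of the execution logs is the track length in seconds. The page does not break down the gap between total and prediction time, so describing it as queueing and setup overhead is an assumption.

```python
# Timings reported on this page.
prediction_s = 65.82
total_s = 141.69

# Assumption: 204.75 in the separation progress bar is the track length in seconds.
audio_s = 204.75

# Time not spent inside the prediction itself (presumably queueing and setup).
overhead_s = total_s - prediction_s

# Seconds of audio processed per second of prediction time.
speed_factor = audio_s / prediction_s

print(f"overhead: {overhead_s:.2f}s")         # overhead: 75.87s
print(f"speed: {speed_factor:.2f}x realtime")  # speed: 3.11x realtime
```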
All Input Parameters
{
  "activ": false,
  "embed": false,
  "model": "harmonix-all",
  "sonify": true,
  "visualize": true,
  "music_input": "https://replicate.delivery/pbxt/K3iP4RhDPayT24NMYswahQ7kYfG1NS4vhNaF3PZVSZoLaSSY/x2mate.com%20-%20Sean%20Lennon.%20Parachute%20%28128%20kbps%29.mp3",
  "include_embeddings": false,
  "include_activations": false
}
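
A minimal sketch of sending the same parameters from Python, assuming the third-party `replicate` client package is installed and a `REPLICATE_API_TOKEN` is set in the environment. The model reference combines the model name with the version ID listed under Version Details. `run_example` is a hypothetical helper name, and the shape of the return value may differ between client versions.

```python
# Model name plus the version ID given on this page.
MODEL_REF = (
    "sakemin/all-in-one-music-structure-analyzer:"
    "001b4137be6ac67bdc28cb5cffacf128b874f530258d033de23121e785cb7290"
)

# Flags taken from the example run; music_input is supplied by the caller.
EXAMPLE_FLAGS = {
    "activ": False,
    "embed": False,
    "model": "harmonix-all",
    "sonify": True,
    "visualize": True,
    "include_embeddings": False,
    "include_activations": False,
}


def run_example(music_url: str):
    """Run the analyzer on an audio URL (hypothetical helper, not part of the model)."""
    # Imported lazily so the constants above can be used without the client installed.
    import replicate

    payload = {**EXAMPLE_FLAGS, "music_input": music_url}
    return replicate.run(MODEL_REF, input=payload)
```

Calling `run_example("https://example.com/track.mp3")` would start a prediction; the URL here is a placeholder.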
Input Parameters
activ Type: boolean, Default: false
Save frame-level raw activations from sigmoid and softmax
embed Type: boolean, Default: false
Save frame-level embeddings
model Default: harmonix-all
Name of the pretrained model to use
sonify Type: boolean, Default: false
Save sonifications
visualize Type: boolean, Default: false
Save visualizations
music_input Type: string
An audio file input to analyze.
include_embeddings Type: boolean, Default: false
Whether to include embeddings in the analysis results or not.
include_activations Type: boolean, Default: false
Whether to include activations in the analysis results or not.
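
The parameter list above can be mirrored client-side to catch typos before a request is sent. This is only a sketch: `build_input` is a hypothetical helper, and treating `model` as a string is an assumption, since the page lists a default for it but no type.

```python
# Defaults copied from the parameter list; music_input has no listed default.
DEFAULTS = {
    "activ": False,
    "embed": False,
    "model": "harmonix-all",  # assumption: a string, the page gives no type
    "sonify": False,
    "visualize": False,
    "include_embeddings": False,
    "include_activations": False,
}


def build_input(music_input: str, **overrides) -> dict:
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    for name, value in overrides.items():
        expected = type(DEFAULTS[name])
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    return {**DEFAULTS, **overrides, "music_input": music_input}
```

For example, `build_input("https://example.com/track.mp3", visualize=True)` returns a dictionary with every documented parameter filled in, while a misspelled name such as `visualise=True` raises `ValueError`.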
Output Schema

Output

Type: array, Items Type: string, Items Format: uri
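
Since the output is an array of URI strings, a caller will usually want to save each item locally. The sketch below uses only the standard library; `local_name` and `download_all` are hypothetical helper names, and the page does not say what file names or formats the URIs point to.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse
from urllib.request import urlretrieve


def local_name(uri: str) -> str:
    """Derive a file name from the last path segment of a URI."""
    return PurePosixPath(unquote(urlparse(uri).path)).name


def download_all(uris, dest="outputs"):
    """Download every URI in the output array into dest and return the saved paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for uri in uris:
        # str() also covers clients that wrap each output in a file-like object.
        target = dest_dir / local_name(str(uri))
        urlretrieve(str(uri), target)
        saved.append(target)
    return saved
```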

Example Execution Logs
=> Found 0 tracks already analyzed and 1 tracks to analyze.
=> Found 0 tracks already demixed, 1 to demix.
Downloading: "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th" to /root/.cache/torch/hub/checkpoints/955717e8-8726e21a.th
  0%|          | 0.00/80.2M [00:00<?, ?B/s]
  0%|          | 104k/80.2M [00:00<01:21, 1.03MB/s]
  1%|          | 488k/80.2M [00:00<00:31, 2.68MB/s]
  2%|▏         | 1.96M/80.2M [00:00<00:09, 8.46MB/s]
 10%|▉         | 7.84M/80.2M [00:00<00:02, 29.2MB/s]
 38%|███▊      | 30.9M/80.2M [00:00<00:00, 105MB/s] 
 68%|██████▊   | 54.8M/80.2M [00:00<00:00, 155MB/s]
 97%|█████████▋| 77.6M/80.2M [00:00<00:00, 182MB/s]
100%|██████████| 80.2M/80.2M [00:00<00:00, 117MB/s]
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in /src/demix/htdemucs
Separating track /tmp/tmphbi1m7kax2mate.com - Sean Lennon. Parachute (128 kbps).mp3
  0%|                                                                                 | 0.0/204.75 [00:00<?, ?seconds/s]
  3%|██                                                                      | 5.85/204.75 [00:01<01:02,  3.19seconds/s]
  6%|████                                                                    | 11.7/204.75 [00:01<00:27,  7.12seconds/s]
  9%|████▉                                                     | 17.549999999999997/204.75 [00:02<00:15, 11.75seconds/s]
 11%|████████▏                                                               | 23.4/204.75 [00:02<00:10, 16.92seconds/s]
 14%|██████████▏                                                            | 29.25/204.75 [00:02<00:07, 22.37seconds/s]
 17%|█████████▉                                                | 35.099999999999994/204.75 [00:02<00:06, 27.74seconds/s]
 20%|███████████▌                                              | 40.949999999999996/204.75 [00:02<00:04, 32.84seconds/s]
 23%|████████████████▍                                                       | 46.8/204.75 [00:02<00:04, 37.21seconds/s]
 26%|██████████████████▎                                                    | 52.65/204.75 [00:02<00:03, 40.95seconds/s]
 29%|████████████████████▌                                                   | 58.5/204.75 [00:02<00:03, 43.60seconds/s]
 31%|██████████████████████▎                                                | 64.35/204.75 [00:02<00:03, 45.56seconds/s]
 34%|████████████████████▏                                      | 70.19999999999999/204.75 [00:03<00:02, 47.30seconds/s]
 37%|██████████████████████████▎                                            | 76.05/204.75 [00:03<00:02, 48.58seconds/s]
 40%|███████████████████████▌                                   | 81.89999999999999/204.75 [00:03<00:02, 49.59seconds/s]
 43%|██████████████████████████████▍                                        | 87.75/204.75 [00:03<00:02, 50.27seconds/s]
 46%|████████████████████████████████▉                                       | 93.6/204.75 [00:03<00:02, 50.95seconds/s]
 49%|████████████████████████████▋                              | 99.44999999999999/204.75 [00:03<00:02, 51.35seconds/s]
 51%|████████████████████████████████████▌                                  | 105.3/204.75 [00:03<00:01, 51.61seconds/s]
 54%|███████████████████████████████▍                          | 111.14999999999999/204.75 [00:03<00:01, 51.40seconds/s]
 57%|████████████████████████████████████████▌                              | 117.0/204.75 [00:03<00:01, 51.55seconds/s]
 60%|██████████████████████████████████████████                            | 122.85/204.75 [00:04<00:01, 52.05seconds/s]
 63%|████████████████████████████████████████████▋                          | 128.7/204.75 [00:04<00:01, 52.24seconds/s]
 66%|██████████████████████████████████████                    | 134.54999999999998/204.75 [00:04<00:01, 51.83seconds/s]
 69%|███████████████████████████████████████▊                  | 140.39999999999998/204.75 [00:04<00:01, 51.48seconds/s]
 71%|██████████████████████████████████████████████████                    | 146.25/204.75 [00:04<00:01, 51.40seconds/s]
 74%|████████████████████████████████████████████████████▋                  | 152.1/204.75 [00:04<00:01, 51.40seconds/s]
 77%|█████████████████████████████████████████████████████▉                | 157.95/204.75 [00:04<00:00, 51.45seconds/s]
 80%|██████████████████████████████████████████████▍           | 163.79999999999998/204.75 [00:04<00:00, 51.26seconds/s]
 83%|████████████████████████████████████████████████          | 169.64999999999998/204.75 [00:04<00:00, 51.05seconds/s]
 86%|████████████████████████████████████████████████████████████▊          | 175.5/204.75 [00:05<00:00, 51.64seconds/s]
 89%|██████████████████████████████████████████████████████████████        | 181.35/204.75 [00:05<00:00, 52.30seconds/s]
 91%|████████████████████████████████████████████████████████████████▉      | 187.2/204.75 [00:05<00:00, 51.43seconds/s]
 94%|██████████████████████████████████████████████████████▋   | 193.04999999999998/204.75 [00:05<00:00, 51.65seconds/s]
 97%|████████████████████████████████████████████████████████▎ | 198.89999999999998/204.75 [00:05<00:00, 49.10seconds/s]
100%|██████████████████████████████████████████████████████████████████████| 204.75/204.75 [00:05<00:00, 50.66seconds/s]
100%|██████████████████████████████████████████████████████████████████████| 204.75/204.75 [00:05<00:00, 36.00seconds/s]
=> Found 0 spectrograms already extracted, 1 to extract.
Extracting spectrograms:   0%|          | 0/1 [00:00<?, ?it/s]
Extracting spectrograms: 100%|██████████| 1/1 [00:06<00:00,  6.23s/it]
harmonix-fold0-0vra4ys2.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold0-0vra4ys2.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 5.04MB/s]
harmonix-fold0-0vra4ys2.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 5.03MB/s]
harmonix-fold1-3ozjhtsj.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold1-3ozjhtsj.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 26.8MB/s]
harmonix-fold2-gmgo0nsy.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold2-gmgo0nsy.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 77.8MB/s]
harmonix-fold3-i92b7m8p.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold3-i92b7m8p.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 81.7MB/s]
harmonix-fold4-1bql5qo0.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold4-1bql5qo0.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 78.5MB/s]
harmonix-fold5-x4z5zeef.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold5-x4z5zeef.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 77.5MB/s]
harmonix-fold6-x7t226rq.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold6-x7t226rq.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 76.5MB/s]
harmonix-fold7-qwwskhg6.pth:   0%|          | 0.00/1.40M [00:00<?, ?B/s]
harmonix-fold7-qwwskhg6.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 77.7MB/s]
0%|          | 0/1 [00:00<?, ?it/s]
Analyzing tmphbi1m7kax2mate.com - Sean Lennon. Parachute (128 kbps).mp3:   0%|          | 0/1 [00:00<?, ?it/s]
Analyzing tmphbi1m7kax2mate.com - Sean Lennon. Parachute (128 kbps).mp3: 100%|██████████| 1/1 [00:03<00:00,  3.43s/it]
Visualizing results:   0%|          | 0/1 [00:00<?, ?it/s]
Visualizing results: 100%|██████████| 1/1 [00:10<00:00, 10.77s/it]
=> Plots are successfully saved to ./viz
Sonifying results:   0%|          | 0/1 [00:00<?, ?it/s]
Sonifying results: 100%|██████████| 1/1 [00:07<00:00,  7.53s/it]
=> Sonified tracks are successfully saved to ./sonif
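
The completed progress-bar lines in the logs above carry a per-stage duration, which can be pulled out with a regular expression. This is a rough sketch that assumes the tqdm-style line format shown in this log; `stage_seconds` is a hypothetical helper, and download or separation bars that do not end in `s/it` are deliberately ignored.

```python
import re

# Matches finished lines such as
#   "Visualizing results: 100%|████| 1/1 [00:10<00:00, 10.77s/it]"
DONE_LINE = re.compile(
    r"^(?P<stage>.+?):\s+100%\|[^|]*\|\s*\d+/\d+\s+"
    r"\[[^\]]*?,\s*(?P<secs>\d+(?:\.\d+)?)s/it\]\s*$"
)


def stage_seconds(log_text: str) -> dict[str, float]:
    """Map each stage label to the seconds-per-item on its completed line."""
    stages = {}
    for line in log_text.splitlines():
        match = DONE_LINE.match(line.strip())
        if match:
            # Repeated final lines overwrite each other with the same value.
            stages[match.group("stage")] = float(match.group("secs"))
    return stages


SAMPLE = """\
Extracting spectrograms: 100%|██████████| 1/1 [00:06<00:00,  6.23s/it]
harmonix-fold1-3ozjhtsj.pth: 100%|██████████| 1.40M/1.40M [00:00<00:00, 26.8MB/s]
Visualizing results: 100%|██████████| 1/1 [00:10<00:00, 10.77s/it]
Sonifying results: 100%|██████████| 1/1 [00:07<00:00,  7.53s/it]
"""

timings = stage_seconds(SAMPLE)
```

With the three matching sample lines, `timings` holds 6.23, 10.77 and 7.53 seconds for spectrogram extraction, visualization and sonification respectively.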
Version Details
Version ID
001b4137be6ac67bdc28cb5cffacf128b874f530258d033de23121e785cb7290
Version Created
December 21, 2023