Audio Plugin Recommendation Systems for Music Production

Paulo Mateus
October 18, 2019

Presentation of Audio Plugin Recommendation Systems for Music Production
PMM Silva, CLC Mattos, AH Souza Júnior
2019 8th Brazilian Conference on Intelligent Systems (BRACIS), 854-859

Icons
- - - -
Icons made by Freepik, itim2101, Flat Icons, xnimrodx and smalllikeart from http://www.flaticon.com

Zoom audio plugins:
- - - - - - - - - - - - - - - -
Disclaimer: Zoom is a trademark of ZOOM Corporation (http://www.zoom.co.jp), registered in the US and/or other countries. All product names, trademarks and registered trademarks are the property of their respective owners. All company, product, and service names used herein are for identification purposes only and are not intended to infringe the copyrights of their respective owners. The use of these names and trademarks does not imply endorsement.

Transcript

  1. Audio Plugin Recommendation Systems for Music Production

    Paulo Mateus M. da Silva – IFCE
    César Lincoln C. Mattos – UFC
    Amauri Holanda de S. Junior – IFCE
  2. Motivation

    The large number of available configurations means that making the best use of such equipment is restricted to specialists.
    • 6 slots in a configuration;
    • 117 possible plugins for each slot;
    • 117^6 possible configurations.
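
As a quick check of the figures on this slide, the size of the search space can be computed directly. The constant names below are illustrative and not taken from the deck; the result is roughly 2.57 × 10^12 configurations.

```python
# Size of the configuration space quoted on the slide.
NUM_SLOTS = 6      # slots in a configuration
NUM_PLUGINS = 117  # possible plugins for each slot

# Each slot is chosen independently, so the count is 117 ** 6.
num_configurations = NUM_PLUGINS ** NUM_SLOTS
print(f"{num_configurations:,}")  # 2,565,164,201,769
```
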
  3. Motivation

    Intelligent systems for music production aim to provide automatic tools that ease and support decision making by music actors:
    1. suggest plugins based on the choices the user has already made;
    2. generate full configurations (plugin selection);
    3. define the positioning of the plugins.
  7. Objective

    Recommendation systems for decision support. Two distinct approaches:
    • Supervised learning with classification models;
    • Collaborative filtering.
  8. Related work

    • Automatic parameter tuning to find the most suitable parameter values (e.g., gain, treble level) of a given plugin;
    • Plugin recommendations.
  9. Related work – Plugin recommendations

    S. Stasis et al.: Audio processing chain recommendation¹.
    • Markov chains to model plugin sequences based on timbral descriptive terms, desired audio effects and music genre [1];
    • Dataset:
      • 178 configurations collected with a web form answered by music producers, who had to define a chain of plugins to be applied to random audio samples²;
      • 4 audio plugins.

    ¹ S. Stasis et al. Proceedings of the 20th International Conference on Digital Audio Effects. 2017.
    ² Mixing Secrets – http://www.cambridge-mt.com/ms-mtk.htm
  10. The dataset of audio plugins

    Data extracted from Guitar Patches³
    • Effects unit: Zoom G3;
    • 117 distinct plugins;
    • A configuration: 6 audio plugins;
    • Plugin None;
    • 11 categories;
    • 5,123 configurations → 2,161 unique configurations.

    ³ https://www.guitarpatches.com
  11. Models

    Classification
    • Logistic Regression (LR);
    • Multilayer Perceptron (MLP);
    • Support Vector Machine (SVM).

    Collaborative filtering
    • k-Nearest Neighbors (kNN);
    • Restricted Boltzmann Machine for Collaborative Filtering (RBM-CF)⁴.

    ⁴ R. Salakhutdinov et al. Restricted Boltzmann machines for collaborative filtering. ICML'07. 2007.
  12. User-based collaborative filtering

    Automatic predictions about a user's interests, based on users with a similar profile.

    [Figure: Users × Items rating matrix (labels A, B, C) with unknown entries marked "?".]

    "For a given movie, what rating would the user give?"
  13. Collaborative filtering in our case

    Automatic plugin predictions for a configuration, based on similar configurations.

    "For a given position/slot, which audio plugin would be most suitable?"
  14. Data encoding

    • One-hot M-dimensional: preserves plugin positioning, but implies a large input dimension;
    • Bag-of-words: small input dimension, but positional information is lost.

    [Figure: the configuration "8 4 6 1 5 2" encoded as one-hot M-dimensional (vector concatenation of the per-slot one-hot vectors) and as bag-of-words (binary 'OR' applied to the one-hot vectors); a one-hot "Class" vector is also shown.]
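
The two encodings can be sketched as follows. This is a minimal illustration, assuming integer plugin ids starting at 1 and a toy vocabulary of 8 plugins (the dataset in the deck has 117); the function names are hypothetical and not taken from the authors' code.

```python
from typing import List

NUM_PLUGINS = 8  # toy vocabulary size, an assumption for this example

def one_hot_concat(config: List[int], num_plugins: int = NUM_PLUGINS) -> List[int]:
    """Concatenate one one-hot vector per slot; slot order is preserved."""
    vector: List[int] = []
    for plugin_id in config:
        slot = [0] * num_plugins
        slot[plugin_id - 1] = 1  # plugin ids are assumed to start at 1
        vector.extend(slot)
    return vector

def bag_of_words(config: List[int], num_plugins: int = NUM_PLUGINS) -> List[int]:
    """OR the per-slot one-hot vectors together; slot order is lost."""
    vector = [0] * num_plugins
    for plugin_id in config:
        vector[plugin_id - 1] = 1
    return vector

config = [8, 4, 6, 1, 5, 2]
one_hot = one_hot_concat(config)
bow = bag_of_words(config)
print(len(one_hot), len(bow))  # 48 8
print(bow)                     # [1, 1, 0, 1, 1, 1, 0, 1]
```

Reordering the plugins changes the one-hot vector but leaves the bag-of-words vector unchanged, which is the trade-off the slide describes.
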
  15. Classification: Application

    Since one classifier is required for each position to be recommended, a separate dataset must be generated for each position.

    [Figure: "Generate incomplete configurations": each plugin of the configuration "8 4 6 1 5 2" is removed in turn, and the missing audio plugin is what has to be predicted.]
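
One way to build the per-position datasets is sketched below. It assumes that a removed plugin is marked with `None` and kept in place; the deck does not say whether the slot is masked or dropped, so this representation and the function name are illustrative only.

```python
from typing import List, Optional, Tuple

Configuration = List[int]
Incomplete = List[Optional[int]]

def incomplete_configurations(config: Configuration) -> List[Tuple[int, Incomplete, int]]:
    """Return one (position, features, target) example per slot.

    For each position the plugin is masked with None (an assumption)
    and becomes the class that the classifier for that position predicts.
    """
    examples = []
    for position, plugin_id in enumerate(config):
        features: Incomplete = list(config)
        features[position] = None
        examples.append((position, features, plugin_id))
    return examples

for position, features, target in incomplete_configurations([8, 4, 6, 1, 5, 2]):
    print(position, features, target)
```

Grouping the examples by `position` yields the separate training set for each classifier.
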
  16. Collaborative filtering: Application

    The models receive all the attributes as input. A single model produces all the predictions!

    [Figure: Attributes (8 4 6 1 5 2) → Encoding → Model.]
  17. Collaborative filtering: kNN

    Hamming distance metric.

    [Figure: for the incomplete configuration "? 4 6 1 5 2", kNN retrieves the k most similar configurations and produces a recommendation list for the missing slot (in the example, four candidates scored 40%, 20%, 20% and 20%).]
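
A minimal sketch of this idea follows, assuming that the Hamming distance is computed over the known slots only and that candidates are scored by their frequency among the k neighbours. The toy dataset and the function names are hypothetical; the deck does not give the exact scoring rule.

```python
from collections import Counter
from typing import List, Optional, Tuple

def hamming_distance(query: List[Optional[int]], candidate: List[int]) -> int:
    """Count mismatching slots, ignoring slots that are unknown (None) in the query."""
    return sum(1 for q, c in zip(query, candidate) if q is not None and q != c)

def recommend(query: List[Optional[int]], dataset: List[List[int]],
              k: int = 5) -> List[Tuple[int, float]]:
    """Rank plugins for the single missing slot by frequency among the k nearest configurations."""
    missing = query.index(None)
    neighbours = sorted(dataset, key=lambda c: hamming_distance(query, c))[:k]
    votes = Counter(c[missing] for c in neighbours)
    return [(plugin, count / len(neighbours)) for plugin, count in votes.most_common()]

# Hypothetical toy data, not taken from the Guitar Patches dataset.
dataset = [
    [20, 4, 6, 1, 5, 2],
    [20, 4, 6, 1, 5, 3],
    [7, 4, 6, 1, 9, 2],
    [3, 4, 6, 2, 5, 2],
    [9, 4, 7, 1, 5, 2],
    [11, 10, 2, 6, 7, 3],
]
query = [None, 4, 6, 1, 5, 2]
print(recommend(query, dataset, k=5))
# [(20, 0.4), (7, 0.2), (3, 0.2), (9, 0.2)]
```
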
  18. Evaluation metrics

    1. Accuracy: is the expected item in the first position?
    2. Hit@5: is the expected item in the top 5?
    3. MDCG⁵: how close is the expected item to the top positions?
    4. MAP@5⁶: proportion of the top 5 recommended plugins that are in the same category as the expected item.

    ⁵ Mean (Normalized) Discounted Cumulative Gain
    ⁶ Mean Average Precision
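
The per-example versions of these metrics can be sketched as below; averaging them over a test set gives the reported values. The NDCG form assumes a single relevant item per recommendation list (so the ideal DCG is 1), and the category-based score follows the slide's one-line description rather than the paper's exact formula. All names and the category mapping are illustrative.

```python
import math
from typing import Dict, Sequence

def accuracy(ranked: Sequence[int], expected: int) -> float:
    """1 if the expected plugin is ranked first, else 0."""
    return 1.0 if ranked and ranked[0] == expected else 0.0

def hit_at_k(ranked: Sequence[int], expected: int, k: int = 5) -> float:
    """1 if the expected plugin appears in the top k, else 0."""
    return 1.0 if expected in ranked[:k] else 0.0

def ndcg(ranked: Sequence[int], expected: int) -> float:
    """NDCG with one relevant item: 1 / log2(rank + 1), with rank starting at 1."""
    if expected not in ranked:
        return 0.0
    rank = list(ranked).index(expected) + 1
    return 1.0 / math.log2(rank + 1)

def category_precision_at_k(ranked: Sequence[int], expected: int,
                            category_of: Dict[int, str], k: int = 5) -> float:
    """Share of the top-k plugins in the same category as the expected plugin."""
    top = ranked[:k]
    if not top:
        return 0.0
    same = sum(1 for p in top if category_of[p] == category_of[expected])
    return same / len(top)

ranked = [20, 7, 3, 9, 11]
categories = {20: "distortion", 7: "distortion", 3: "delay", 9: "reverb", 11: "distortion"}
print(accuracy(ranked, 7), hit_at_k(ranked, 7), round(ndcg(ranked, 7), 4))
print(category_precision_at_k(ranked, 7, categories))
```
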
  19. Experiments

    1. 5-fold cross-validation: 4 folds for training and 1 for testing;
    2. Hyperparameter selection: 2-fold cross-validation using the training data;
    3. Generation of incomplete configurations;
    4. Evaluation: average over the five training samples (generated by the cross-validation).

    [Figure: "Generate incomplete configurations": each plugin of the configuration "8 4 6 1 5 2" is removed in turn, leaving a missing audio plugin.]
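
The nested splitting scheme (5 outer folds, 2 inner folds on the training part) can be sketched as follows. The fold assignment here is a simple deterministic interleaving with no shuffling, which is an assumption; the deck does not describe how folds were drawn, and `k_fold_indices` is a hypothetical helper.

```python
from typing import List, Tuple

def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; sample i goes to fold i % k."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        splits.append((train, test))
    return splits

n_samples = 20  # toy size for illustration
outer_splits = k_fold_indices(n_samples, 5)
for train, test in outer_splits:
    # Inner 2-fold split for hyperparameter selection; the inner indices
    # are positions within `train`, not positions in the full dataset.
    inner_splits = k_fold_indices(len(train), 2)
    print(len(train), len(test), [len(t) for t, _ in inner_splits])
```
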
  20. Hyperparameter search

    Table 1: Hyperparameters tested in the grid search step.

    Model  | Hyperparameter   | Values
    LR     | Optimizer        | Coordinate Descent
    LR     | Class selection  | one-vs-all
    MLP    | # hidden layers  | 1
    MLP    | # hidden units   | 20, 40, 80, 100
    SVM    | C                | 2^-3, 2^0, ..., 2^12
    SVM    | γ                | 2^-13, 2^-10, ..., 2^2
    SVM    | Class selection  | one-vs-all
    kNN    | # neighbors      | 1, 5, ..., 25, 40, 60, 80, 100
    kNN    | Distance metric  | Hamming distance [2]
    RBM-CF | CD               | 1
    RBM-CF | # unit elements  | 10.000
    RBM-CF | Mini-batch size  | 10
    RBM-CF | Learning rate    | Adam(lr=0.05)
    RBM-CF | # epochs         | Defined by early stopping
  21. Plugin recommendation on a single position

    Table 2: Summary of the experiments (averages and corresponding standard deviations) for recommendation on a single position. Best results are in boldface.

    Model  | Encoding | Accuracy (%) | Hit@5 (%)    | MDCG (×10^-2) | MAP@5 (%)
    LR     | 1-hot    | 39.0 ± 12.8  | 59.6 ± 11.4  | 59.1 ± 09.7   | 32.5 ± 15.4
    LR     | BoW      | 34.3 ± 13.7  | 51.5 ± 14.1  | 54.4 ± 10.9   | 31.8 ± 16.1
    MLP    | 1-hot    | 34.7 ± 12.7  | 55.2 ± 11.1  | 55.7 ± 09.7   | 30.7 ± 15.2
    MLP    | BoW      | 35.2 ± 13.1  | 56.1 ± 11.8  | 56.4 ± 10.0   | 32.0 ± 15.0
    SVM    | 1-hot    | 39.5 ± 12.2  | 58.6 ± 12.0  | 58.5 ± 10.0   | 33.4 ± 15.6
    SVM    | BoW      | 38.1 ± 13.1  | 58.3 ± 12.1  | 58.2 ± 10.2   | 33.4 ± 15.6
    kNN    | IDs      | 35.6 ± 13.1  | 52.52 ± 12.8 | 54.5 ± 10.4   | 30.4 ± 15.6
    kNN    | BoW      | 34.3 ± 13.6  | 52.10 ± 13.2 | 54.0 ± 10.8   | 30.1 ± 15.7
    RBM-CF | 1-hot    | 30.9 ± 13.8  | 50.4 ± 13.48 | 52.6 ± 10.9   | 31.0 ± 15.3
  22. Plugin recommendation on multiple positions

    [Figure: Hit@5 (y-axis, 40.0 to 60.0) versus the number of missing plugins in the patch (x-axis, 1 to 5) for RBM-CF, kNN, LR and MF.]
  23. Plugin positioning in a given configuration with the RBM-CF model

    [Figure: Accuracy (y-axis, 60.0 to 100.0) versus the number of missing entries (x-axis, 2 to 6).]
  24. Full configuration recommendation with the RBM-CF

    [Figure: pipeline with the labels "Generate random configurations", "Sampling" and "Embedding", shown at steps (0), (1), ..., (1000).]

    • Distortion → Equalizer → ZNR
    • ZNR → Sim. Guitar Acoustic → Equalizer → Delay
    • ZNR → Compressor → Equalizer → ZNR → Amp. Simulator → Reverb
  26. Final considerations

    • Defined a methodology to use ML to support decision making by music actors in three cases;
    • Classifiers:
      • best results;
      • an exponential number of trained models is necessary to cover all missing-plugin cases⁷;
    • Collaborative filtering models:
      • a single model for all cases;
      • generative model: generates full configurations and chooses the best plugin positioning.

    Future work
    • Adding external information: demographics, plugin categories (other taxonomies [3]);
    • Use the RBM-CF latent representation for data embedding;
    • Application to similar problems, e.g., Pokémon team selection.

    ⁷ It requires $\sum_{i=0}^{M} i \binom{M}{i}$ models for a configuration with M plugin instances.
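
The count in footnote 7 can be checked numerically. The sketch below reads the footnote as the sum over i of i times "M choose i" (an interpretation of the flattened formula: i classifiers for each of the ways to choose i missing positions), and uses the standard identity that this sum equals M · 2^(M−1). The function name is illustrative.

```python
from math import comb

def num_classifiers(m: int) -> int:
    """Sum over i of i * C(m, i): i models for every set of i missing positions."""
    return sum(i * comb(m, i) for i in range(m + 1))

M = 6  # plugins per configuration in the dataset
print(num_classifiers(M))  # 192
assert num_classifiers(M) == M * 2 ** (M - 1)
```

For the 6-slot configurations in the dataset this gives 192 classifiers, against a single model for the collaborative filtering approach.
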
  27. References

    [1] Spyridon Stasis, Nicholas Jillings, Sean Enderby, and Ryan Stables. Audio processing chain recommendation. In Proceedings of the 20th International Conference on Digital Audio Effects (Edinburgh, UK), 2017.
    [2] Dell Zhang, Jun Wang, Deng Cai, and Jinsong Lu. Self-taught hashing for fast similarity search. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '10, pages 18–25, New York, NY, USA, 2010. ACM.
    [3] Vincent Verfaille, Catherine Guastavino, and Caroline Traube. An interdisciplinary approach to audio effect classification. In Proc. of the 9th Int. Conference on Digital Audio Effects. Citeseer, 2006.