Audio Plugin Recommendation Systems for Music Production

Paulo Mateus M. da Silva – IFCE
César Lincoln C. Mattos – UFC
Amauri Holanda de S. Junior – IFCE

Index
• Introduction
• Related work
• Methodology
• Experiments
• Final considerations

Introduction

Motivation
[Figure: signal chain with the labels input signal, effects units, output signal, and music actors]


Motivation
A large number of available configurations means that only specialists can make good use of such equipment.
• Slots in a configuration: 6
• Possible plugins for each slot: 117
• Possible configurations: $117^6$
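As a quick check of the scale involved, the count can be worked out directly. This is a minimal sketch that assumes every one of the 6 slots can independently hold any of the 117 plugins; the variable names are illustrative.

```python
# Size of the configuration space, assuming each slot is chosen independently.
SLOTS = 6      # slots in a configuration (from the slide)
PLUGINS = 117  # possible plugins for each slot (from the slide)

total_configurations = PLUGINS ** SLOTS
print(f"{total_configurations:,}")  # 2,565,164,201,769
```

That is roughly 2.6 trillion candidate configurations, which is far too many to explore by hand.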

Motivation
Intelligent systems for music production aim to provide automatic tools that ease and support decision making by music actors:
1. suggest plugins based on the choices the user has already made;
2. generate full configurations (plugin selection);
3. define the positioning of the plugins.

Objective
Recommendation systems for decision support. Two distinct approaches:
• Supervised learning with classification models;
• Collaborative filtering.

Related work

Related work
• Automatic parameter tuning: finding the most suitable parameter values (e.g., gain, treble level) of a given plugin;
• Plugin recommendation.

Related work – Plugin recommendation
S. Stasis et al.: Audio processing chain recommendation [1].
• Markov chains to model plugin sequences based on timbral descriptive terms, desired audio effects, and music genre;
• Dataset:
  • 178 configurations collected through a web form answered by music producers, who had to define a chain of plugins to be applied to random audio samples (Mixing Secrets, http://www.cambridge-mt.com/ms-mtk.htm);
  • 4 audio plugins.

Methodology

The dataset of audio plugins
Data extracted from Guitar Patches (https://www.guitarpatches.com):
• Effects unit: Zoom G3;
• 117 distinct plugins;
• A configuration: 6 audio plugins;
• Plugin None;
• 11 categories;
• 5,123 configurations → 2,161 unique configurations.

Models
Classification
• Logistic Regression (LR);
• Multilayer Perceptron (MLP);
• Support Vector Machine (SVM).
Collaborative filtering
• k-Nearest Neighbors (kNN);
• Restricted Boltzmann Machine for Collaborative Filtering (RBM-CF) (R. Salakhutdinov et al., Restricted Boltzmann machines for collaborative filtering, ICML '07, 2007).

Collaborative filtering: user-based
Automatic predictions about a user's interests, based on users with a similar profile.
"For a given movie, what rating would the user give?"
[Figure: users-by-items matrix with the labels A, B, C and unknown entries marked "?"]

Collaborative filtering in our case
Automatic plugin predictions for a configuration, based on similar configurations.
"For a given position/slot, which audio plugin would be most suitable?"

Data encoding
[Figure: example configuration of six plugin IDs, 8 4 6 1 5 2]

Data encoding
• One-hot M-dimensional: preserves the positioning of the plugins, but implies a large input dimension;
• Bag-of-words: small input dimension, but carries no positioning information.
[Figure: the configuration 8 4 6 1 5 2 is one-hot encoded per slot; the one-hot M-dimensional encoding concatenates the per-slot vectors, while the bag-of-words encoding combines them with the binary OR operator.]
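The two encodings can be sketched in a few lines. This is a minimal illustration, assuming plugin IDs are integers in the range 0 to 116; the function names `one_hot_concat` and `bag_of_words` are hypothetical and not taken from the slides.

```python
from typing import List

N_PLUGINS = 117  # assumption: plugin IDs are integers in 0..116


def one_hot_concat(config: List[int], n_plugins: int = N_PLUGINS) -> List[int]:
    """Concatenate one one-hot vector per slot, so positions are preserved."""
    vector: List[int] = []
    for plugin_id in config:
        slot = [0] * n_plugins
        slot[plugin_id] = 1
        vector.extend(slot)
    return vector


def bag_of_words(config: List[int], n_plugins: int = N_PLUGINS) -> List[int]:
    """OR the per-slot one-hot vectors together, so positions are lost."""
    vector = [0] * n_plugins
    for plugin_id in config:
        vector[plugin_id] = 1
    return vector


config = [8, 4, 6, 1, 5, 2]        # example configuration from the slide
one_hot = one_hot_concat(config)   # length 6 * 117
bow = bag_of_words(config)         # length 117
```

Under these assumptions the one-hot input has 702 dimensions against 117 for bag-of-words, which is the trade-off the slide describes: any reordering of the same plugins yields the same bag-of-words vector.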

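Classification: Application
[Figure: a configuration with a missing plugin (?) is encoded into attributes and fed to the model, which predicts the class as a ranked list of plugins with probabilities (45.1%, 15.3%, 5.2%, 5.1%, ..., 0.5%).]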

Classification: Application
Since one classifier is required for each position to be recommended, a separate dataset must be generated for each position.
[Figure: generating incomplete configurations. From the configuration 8 4 6 1 5 2, one plugin is removed at a time, and the missing audio plugin is the one to be predicted.]
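The generation of incomplete configurations can be sketched as follows. This is an illustrative reading of the slide, assuming one training sample per slot with the removed plugin used as the class; the `MISSING` marker and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

MISSING = None  # hypothetical marker for the blanked slot

Sample = Tuple[int, List[Optional[int]], int]


def incomplete_configurations(config: List[int]) -> List[Sample]:
    """Return (position, configuration with that slot blanked, removed plugin)."""
    samples: List[Sample] = []
    for position, plugin in enumerate(config):
        features: List[Optional[int]] = list(config)
        features[position] = MISSING
        samples.append((position, features, plugin))
    return samples


samples = incomplete_configurations([8, 4, 6, 1, 5, 2])
for position, features, target in samples:
    print(position, features, target)
```

Grouping the samples by `position` gives one dataset per slot, which matches the need for one classifier per position.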

Collaborative filtering: Application
The models receive all the attributes as input. A single model produces all the predictions!
[Figure: attributes (8 4 6 1 5 2) → encoding → model]

Collaborative filtering: kNN
Hamming distance metric.
[Figure: for the query configuration ? 4 6 1 5 2, kNN retrieves the k most similar configurations and builds a recommendation list for the missing slot over plugins 20, 7, 3 and 9, with scores of 40%, 20%, 20% and 20%.]
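A possible reading of this kNN step is sketched below. It assumes that the Hamming distance is computed over the known slots only and that the recommendation score of a plugin is its share of votes among the k neighbours; both choices, the function names, and the toy dataset are assumptions made for illustration, not details given on the slide.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def hamming(a: Sequence[Optional[int]], b: Sequence[int], skip: int) -> int:
    """Count the positions, other than `skip`, where the configurations differ."""
    return sum(1 for i, (x, y) in enumerate(zip(a, b)) if i != skip and x != y)


def knn_recommend(query: Sequence[Optional[int]], missing_pos: int,
                  dataset: List[List[int]], k: int) -> List[Tuple[int, float]]:
    """Rank plugins for the missing slot by vote share among the k nearest configurations."""
    neighbours = sorted(dataset, key=lambda c: hamming(query, c, missing_pos))[:k]
    votes = Counter(c[missing_pos] for c in neighbours)
    return [(plugin, count / len(neighbours)) for plugin, count in votes.most_common()]


# Toy dataset, invented for this example.
dataset = [
    [20, 4, 6, 1, 5, 2],
    [20, 4, 6, 1, 5, 3],
    [7, 4, 6, 1, 9, 2],
    [3, 4, 6, 2, 5, 2],
    [9, 4, 7, 1, 5, 2],
    [11, 1, 2, 3, 4, 5],
]
query = [None, 4, 6, 1, 5, 2]
recommendations = knn_recommend(query, missing_pos=0, dataset=dataset, k=5)
print(recommendations)
```

With this toy data, plugin 20 receives two of the five votes and plugins 7, 3 and 9 one vote each, mirroring the 40% / 20% / 20% / 20% list in the figure.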

Collaborative filtering: RBM-CF
[Figure: RBM-CF network over binary input vectors, producing the recommendation list]

Evaluation metrics
1. Accuracy: is the expected item in the first position?
2. Hit@5: is the expected item in the top 5?
3. MDCG (Mean (Normalized) Discounted Cumulative Gain): how close is the item to the top positions?
4. MAP@5 (Mean Average Precision): proportion of the top 5 recommended plugins that are of the same category as the expected item.
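The slide gives these metrics in words only, so the following is a sketch under stated assumptions rather than the authors' exact formulas. It assumes a single expected item per recommendation, so that the normalized DCG reduces to 1 / log2(rank + 1), and it implements the fourth metric literally as described (share of top-5 items in the expected item's category). The category mapping is invented for the example.

```python
import math
from typing import Dict, List


def accuracy(expected: int, ranked: List[int]) -> float:
    """1.0 if the expected item is ranked first, else 0.0."""
    return 1.0 if ranked and ranked[0] == expected else 0.0


def hit_at_k(expected: int, ranked: List[int], k: int = 5) -> float:
    """1.0 if the expected item appears in the top k, else 0.0."""
    return 1.0 if expected in ranked[:k] else 0.0


def ndcg_single(expected: int, ranked: List[int]) -> float:
    """NDCG with one relevant item: 1 / log2(rank + 1), or 0.0 if absent."""
    if expected not in ranked:
        return 0.0
    rank = ranked.index(expected) + 1  # 1-based rank
    return 1.0 / math.log2(rank + 1)


def same_category_at_k(expected: int, ranked: List[int],
                       category: Dict[int, str], k: int = 5) -> float:
    """Share of the top-k items that belong to the expected item's category."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(category[p] == category[expected] for p in top) / len(top)


ranked = [20, 7, 3, 9, 11]  # hypothetical recommendation list
expected = 7
category = {20: "distortion", 7: "delay", 3: "delay", 9: "reverb", 11: "distortion"}

scores = (
    accuracy(expected, ranked),
    hit_at_k(expected, ranked),
    ndcg_single(expected, ranked),
    same_category_at_k(expected, ranked, category),
)
print(scores)
```

Averaging each per-recommendation score over all test samples would give the mean values reported per model.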

Experiments

Experiments
1. 5-fold cross-validation: 4 folds for training and 1 for testing;
2. Hyperparameter selection: 2-fold cross-validation on the training data;
3. Generation of incomplete configurations (one plugin of a configuration such as 8 4 6 1 5 2 is removed at a time and becomes the missing audio plugin);
4. Evaluation: average over the five splits generated by the cross-validation.
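The nested splitting in steps 1 and 2 can be sketched with plain index lists. This is a minimal illustration, assuming shuffled folds of near-equal size; the dataset size of 100 and the helper name `k_fold_indices` are hypothetical.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits over a shuffled range of n_samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


N_SAMPLES = 100  # hypothetical dataset size
outer = k_fold_indices(N_SAMPLES, k=5)

inner_sizes = []
for train, test in outer:
    # Hyperparameters are selected using the training part only (2-fold).
    # The inner indices refer to positions within `train`.
    inner = k_fold_indices(len(train), k=2, seed=1)
    inner_sizes.append([(len(fit), len(val)) for fit, val in inner])

print(inner_sizes[0])
```

Keeping the inner split inside the training folds means the test fold never influences the choice of hyperparameters.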

Hyperparameter search
Table 1: Hyperparameters tested in the grid search step.

Model    Hyperparameter     Values
LR       Optimizer          Coordinate descent
         Class selection    one-vs-all
MLP      # hidden layers    1
         # hidden units     20, 40, 80, 100
SVM      C                  2^-3, 2^0, ..., 2^12
         γ                  2^-13, 2^-10, ..., 2^2
         Class selection    one-vs-all
kNN      # neighbors        1, 5, ..., 25, 40, 60, 80, 100
         Distance metric    Hamming distance [2]
RBM-CF   CD                 1
         # unit elements    10.000
         Mini-batch size    10
         Learning rate      Adam (lr = 0.05)
         # epochs           Defined by early stopping
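For the SVM rows of Table 1, the grid can be enumerated as below. This is a sketch that assumes the listed values are powers of two whose exponents advance in steps of 3 (inferred from the first two values in each row, which also lands exactly on the stated endpoints); the variable names are illustrative.

```python
from itertools import product

# Assumption: exponents advance in steps of 3, as the first two listed values suggest.
C_values = [2.0 ** e for e in range(-3, 13, 3)]      # 2^-3, 2^0, ..., 2^12
gamma_values = [2.0 ** e for e in range(-13, 3, 3)]  # 2^-13, 2^-10, ..., 2^2

svm_grid = list(product(C_values, gamma_values))
print(len(svm_grid), "candidate (C, gamma) pairs")
```

Under that assumption each parameter takes 6 values, so the SVM grid search evaluates 36 combinations per cross-validation split.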

Plugin recommendation on a single position
Table 2: Summary of the experiments (averages and corresponding standard deviations) for recommendation on a single position. Best results are in boldface.

Model    Encod.   Accur. (%)     Hit@5 (%)       MDCG (×10^-2)   MAP@5 (%)
LR       1-hot    39.0 ± 12.8    59.6 ± 11.4     59.1 ± 09.7     32.5 ± 15.4
         BoW      34.3 ± 13.7    51.5 ± 14.1     54.4 ± 10.9     31.8 ± 16.1
MLP      1-hot    34.7 ± 12.7    55.2 ± 11.1     55.7 ± 09.7     30.7 ± 15.2
         BoW      35.2 ± 13.1    56.1 ± 11.8     56.4 ± 10.0     32.0 ± 15.0
SVM      1-hot    39.5 ± 12.2    58.6 ± 12.0     58.5 ± 10.0     33.4 ± 15.6
         BoW      38.1 ± 13.1    58.3 ± 12.1     58.2 ± 10.2     33.4 ± 15.6
kNN      IDs      35.6 ± 13.1    52.52 ± 12.8    54.5 ± 10.4     30.4 ± 15.6
         BoW      34.3 ± 13.6    52.10 ± 13.2    54.0 ± 10.8     30.1 ± 15.7
RBM-CF   1-hot    30.9 ± 13.8    50.4 ± 13.48    52.6 ± 10.9     31.0 ± 15.3

Plugin recommendation on multiple positions
[Figure: Hit@5 (axis from 40 to 60) versus the number of missing plugins in the patch (1 to 5), for RBM-CF, kNN, LR and MF.]

Plugin positioning in a given configuration with the RBM-CF model
[Figure: accuracy (axis from 60 to 100) versus the number of missing entries (2 to 6).]

Full configuration recommendation with the RBM-CF
[Figure: generate random configurations, then sampling; diagram labels: (0), (1), ..., (1000), Embedding.]
• Distortion → Equalizer → ZNR
• ZNR → Sim. Guitar Acoustic → Equalizer → Delay
• ZNR → Compressor → Equalizer → ZNR → Amp. Simulator → Reverb

Final considerations

Final considerations
• Defined a methodology for using ML to support decision making by music actors in three cases;
• Classifiers:
  • best results;
  • an exponential number of trained models is needed to cover all missing-plugin cases (it requires $\sum_{i=0}^{M} i \binom{M}{i}$ models for a configuration with $M$ plugin instances);
• Collaborative filtering models:
  • a single model for all cases;
  • generative model: generates full configurations and chooses the best plugin positioning.
Future work
• Adding external information: demographics, plugin categories (other taxonomies [3]);
• Using the RBM-CF latent representation for data embedding;
• Application to similar problems, e.g., Pokémon team selection.
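The model count in the footnote can be evaluated numerically. This sketch reads the footnote as the sum over i from 0 to M of i times the binomial coefficient C(M, i), which is an interpretation of the garbled formula; the function name is illustrative.

```python
from math import comb


def classifiers_needed(m: int) -> int:
    """Sum over i of i * C(m, i): one classifier per missing slot in every missing-slot pattern."""
    return sum(i * comb(m, i) for i in range(m + 1))


models_for_six_slots = classifiers_needed(6)
print(models_for_six_slots)  # 192
```

The sum has the closed form M · 2^(M−1), so the number of classifiers grows exponentially with the number of slots, whereas the collaborative filtering approach uses one model throughout.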

Thanks! Questions?
Paulo Mateus Moura da Silva

References
[1] Spyridon Stasis, Nicholas Jillings, Sean Enderby, and Ryan Stables. Audio processing chain recommendation. In Proceedings of the 20th International Conference on Digital Audio Effects (Edinburgh, UK), 2017.
[2] Dell Zhang, Jun Wang, Deng Cai, and Jinsong Lu. Self-taught hashing for fast similarity search. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '10, pages 18–25, New York, NY, USA, 2010. ACM.
[3] Vincent Verfaille, Catherine Guastavino, and Caroline Traube. An interdisciplinary approach to audio effect classification. In Proceedings of the 9th International Conference on Digital Audio Effects. Citeseer, 2006.
