
Multiple Kernel Learning

Master's Progress Report

Matthew Barga

November 15, 2012

Transcript

  1. Multiple Kernel Learning

    Combine multiple kernels and use heterogeneous data sources to improve
    classification accuracy. The target optimization problem is over linear
    combinations of fixed kernel matrices that are computed beforehand:

        K = Σᵢ µᵢ Kᵢ
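
    A minimal sketch of this fixed-weight combination in Python (NumPy and
    scikit-learn are my tooling choices, not the talk's; MKL's actual job is
    to learn the µᵢ rather than fix them by hand):

        # Combine precomputed base kernels with fixed weights and train an
        # SVM on the resulting Gram matrix. Data and weights are hypothetical.
        import numpy as np
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 5))                 # toy data
        y = np.sign(X[:, 0] + 0.1 * rng.normal(size=100))

        def linear_kernel(X):
            return X @ X.T

        def rbf_kernel(X, gamma=0.5):
            sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        kernels = [linear_kernel(X), rbf_kernel(X)]   # the fixed K_i
        mu = np.array([0.3, 0.7])                     # assumed weights mu_i
        K = sum(m * Ki for m, Ki in zip(mu, kernels)) # K = sum_i mu_i K_i

        clf = SVC(kernel="precomputed").fit(K, y)
        print(clf.score(K, y))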
  2. Multiple Kernel Learning

    Learning the Kernel Matrix with Semidefinite Programming (Lanckriet,
    Cristianini, Bartlett, Ghaoui, & Jordan, 2004): used convex optimization
    (semidefinite programming). Large Scale Multiple Kernel Learning
    (Sonnenburg, Rätsch, Schäfer, & Schölkopf, 2006): produced the multiple
    kernel learner SHOGUN, with results for datasets as large as 1M samples
    with 20 kernels.
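
    For reference, the optimization both papers address can be written as
    follows (a sketch in my own notation, not copied from the slides):

        \min_{\mu \ge 0,\; \sum_k \mu_k = 1}
          \omega\Big(\sum_k \mu_k K_k\Big),
        \qquad
        \omega(K) = \max_{0 \le \alpha \le C,\; \alpha^\top y = 0}
          \mathbf{1}^\top \alpha
          - \tfrac{1}{2}\,(\alpha \circ y)^\top K\,(\alpha \circ y)

    Lanckriet et al. solve a semidefinite program over the kernel directly;
    Sonnenburg et al. reformulate the weight learning as a semi-infinite LP
    to scale up.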
  3. SHOGUN

    Software suite including performance measures, normalizers, preprocessors,
    and ML methods. Contains sets of linear classifiers and SVMs, plus MKL
    methods for regression and classification. Canonical data formats: 32-bit
    integers and 32-bit to 96-bit floating-point matrices.
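
    A hedged sketch of MKL training through SHOGUN's legacy Python bindings.
    The module name (modshogun) and the class signatures follow the 2.x-era
    API and are assumptions here; they have changed across SHOGUN releases:

        # MKL classification over two base kernels on the same features.
        # All class/method names are from the legacy modular interface and
        # should be checked against the installed SHOGUN version.
        import numpy as np
        from modshogun import (RealFeatures, CombinedFeatures, BinaryLabels,
                               GaussianKernel, PolyKernel, CombinedKernel,
                               MKLClassification)

        X = np.random.randn(5, 100)          # SHOGUN expects features x samples
        y = np.sign(np.random.randn(100))

        feats = CombinedFeatures()
        kernel = CombinedKernel()
        for k in (GaussianKernel(10, 1.0),    # (cache size, width)
                  PolyKernel(10, 2, False)):  # (cache size, degree, inhomogeneous)
            feats.append_feature_obj(RealFeatures(X))
            kernel.append_kernel(k)
        kernel.init(feats, feats)

        mkl = MKLClassification()
        mkl.set_C(1.0, 1.0)
        mkl.set_kernel(kernel)
        mkl.set_labels(BinaryLabels(y))
        mkl.train()
        print(kernel.get_subkernel_weights())  # the learned convex weights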
  4. SHOGUN

    Targeted primarily at applications in computational biology; handles very
    large datasets, up to 10M samples. Kernel caching is a limitation, so the
    feature mapping is computed explicitly when needed to deal with
    high-dimensional sparse vectors. The learning problem is restricted to
    semi-infinite linear programming (SILP), and an LP solver is used to
    evaluate it; see the sketch below.
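
    A simplified sketch of the SILP wrapper loop from Sonnenburg et al. (2006),
    not SHOGUN's actual implementation: alternate between an SVM solve with
    the current weighted kernel and an LP update of the weights (scipy and
    scikit-learn are my stand-ins for SHOGUN's internal solvers):

        # Semi-infinite LP for MKL: collect one constraint per SVM solve,
        # then re-solve a small LP over (theta, mu).
        import numpy as np
        from scipy.optimize import linprog
        from sklearn.svm import SVC

        def silp_mkl(kernels, y, C=1.0, iters=20):
            n_k = len(kernels)
            mu = np.full(n_k, 1.0 / n_k)     # start from uniform weights
            rows = []                        # S_k(alpha_t) for past iterates t
            for _ in range(iters):
                K = sum(m * Ki for m, Ki in zip(mu, kernels))
                svm = SVC(C=C, kernel="precomputed").fit(K, y)
                a = np.zeros(len(y))         # a_i = alpha_i * y_i
                a[svm.support_] = svm.dual_coef_.ravel()
                # Per-kernel terms S_k = 1/2 a^T K_k a - sum_i alpha_i.
                S = np.array([0.5 * a @ Ki @ a for Ki in kernels])
                rows.append(S - np.abs(a).sum())
                # LP: max theta  s.t.  sum_k mu_k S_k(alpha_t) >= theta (all t),
                #     mu >= 0, sum_k mu_k = 1.  Variables: [theta, mu].
                A_ub = np.hstack([np.ones((len(rows), 1)), -np.array(rows)])
                res = linprog(c=[-1.0] + [0.0] * n_k,
                              A_ub=A_ub, b_ub=np.zeros(len(rows)),
                              A_eq=[[0.0] + [1.0] * n_k], b_eq=[1.0],
                              bounds=[(None, None)] + [(0, None)] * n_k)
                mu = res.x[1:]
            return mu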
  5. Biological Kernels

    A kernel should represent a biologically relevant notion of similarity;
    different kernels suit different situations and datasets. SVMs have
    advanced from analyzing vector-based data to more general forms (strings,
    etc.), but the resulting decision function is difficult to interpret and
    to use for extracting knowledge. SHOGUN claims to overcome this through
    the weights in the convex combination of kernels: especially when the
    weighting is sparse, the decision function is easy to interpret. Kernels
    can also be derived from probabilistic models. A sketch of a simple string
    kernel follows.
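
    As one concrete example of a biologically motivated string kernel (my
    example, not one named in the talk), the k-mer spectrum kernel counts
    shared length-k substrings between two sequences:

        # Spectrum kernel: inner product of k-mer count vectors.
        from collections import Counter

        def spectrum_kernel(s, t, k=3):
            cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
            ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
            return sum(cs[m] * ct[m] for m in cs if m in ct)

        print(spectrum_kernel("GATTACA", "GATTTACA"))   # shared 3-mers -> 5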
  6. Parallel Semidefinite Programming

    Prof. Fujisawa at TIT is the most active in this area (Fujisawa et al.,
    2012; Yamashita, Fujisawa, & Kojima, 2003). Parallelization of the
    necessary matrix factorization (Fujisawa et al., 2012) is similar to the
    QP-problem method. SDPs are inherently sequential, and available software
    remains sparse (SDPs have been investigated since ~2003).
  7. Cited Works

    Fujisawa, K., Sato, H., Matsuoka, S., Endo, T., Yamashita, M., & Nakata,
    M. (2012). High-performance general solver for extremely large-scale
    semidefinite programming problems. In Proceedings of the International
    Conference on High Performance Computing, Networking, Storage and
    Analysis (p. 93).

    Lanckriet, G. R. G., Cristianini, N., Bartlett, P., Ghaoui, L. E., &
    Jordan, M. I. (2004). Learning the kernel matrix with semidefinite
    programming. The Journal of Machine Learning Research, 5, 27–72.
    Retrieved 2012-11-05, from http://dl.acm.org/citation.cfm?id=1005334

    Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large
    scale multiple kernel learning. The Journal of Machine Learning Research,
    7, 1531–1565.

    Yamashita, M., Fujisawa, K., & Kojima, M. (2003). SDPARA: SemiDefinite
    Programming Algorithm paRAllel version.