
Multiple Kernel Learning

Master's progress report

Matthew Barga

November 15, 2012

Transcript

  1. Multiple Kernel Learning

    Combine multiple kernels and heterogeneous data sources to improve
    classification accuracy. The target optimization problem is over a set of
    linear combinations of fixed kernel matrices that are computed beforehand:
    K = Σ_i µ_i K_i (see the sketch below).
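    A minimal sketch of this combination step, assuming NumPy and Gaussian
    (RBF) base kernels; the data X, the bandwidths, and the weights are
    hypothetical, chosen only for illustration:

        import numpy as np

        def rbf_kernel(X, gamma):
            """Gram matrix of the Gaussian (RBF) kernel over the rows of X."""
            sq = np.sum(X**2, axis=1)
            d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
            return np.exp(-gamma * d2)

        X = np.random.randn(50, 4)                   # hypothetical data
        kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]  # fixed K_i
        mu = np.array([0.5, 0.3, 0.2])               # convex weights, sum to 1

        # K = sum_i mu_i * K_i is itself a valid (PSD) kernel when mu_i >= 0,
        # so it can be handed to any standard kernel method.
        K = sum(m * Kk for m, Kk in zip(mu, kernels))

    Because each µ_i is non-negative and the base kernels are fixed and
    precomputed, learning reduces to choosing the weights µ.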
  2. Multiple Kernel Learning

    Learning the Kernel Matrix with Semidefinite Programming (Lanckriet,
    Cristianini, Bartlett, Ghaoui, & Jordan, 2004) used convex optimization
    (semidefinite programming). Large Scale Multiple Kernel Learning
    (Sonnenburg, Rätsch, Schäfer, & Schölkopf, 2006) produced the multiple
    kernel learner SHOGUN, with results for datasets as large as 1M samples
    with 20 kernels. (A toy SDP sketch follows.)
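    To make the SDP connection concrete, here is a toy sketch in CVXPY of a
    simplified, alignment-style variant of kernel learning: maximize the
    alignment of K = Σ_i µ_i K_i with the ideal kernel y yᵀ, subject to K
    being positive semidefinite with fixed trace. This is not the full
    soft-margin formulation of Lanckriet et al. (2004); the library choice
    and the toy data are assumptions:

        import cvxpy as cp
        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.standard_normal((30, 3))             # hypothetical data
        y = np.sign(rng.standard_normal(30))         # hypothetical +/-1 labels

        def rbf(X, gamma):
            d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)

        # Fixed base kernels, symmetrized for numerical safety.
        K_list = [0.5 * (K + K.T) for K in (rbf(X, 0.1), rbf(X, 1.0))]

        mu = cp.Variable(len(K_list))
        K = sum(mu[i] * K_list[i] for i in range(len(K_list)))  # affine in mu

        # The PSD constraint K >> 0 is what makes this a semidefinite program.
        prob = cp.Problem(
            cp.Maximize(cp.sum(cp.multiply(np.outer(y, y), K))),
            [K >> 0, cp.trace(K) == len(y)],
        )
        prob.solve()
        print(mu.value)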
  3. SHOGUN

    Software suite including performance measures, normalizers, preprocessors,
    and ML methods. Contains sets of linear classifiers and SVMs, as well as
    MKL methods for both regression and classification. Canonical data
    formats: 32-bit integers; 32-bit to 96-bit floating-point matrices.
  4. SHOGUN

    Targeted primarily at applications in computational biology, with very
    large datasets of up to 10M samples. Caching limitations; the function
    mapping is computed explicitly when needed; handling of high-dimensional
    sparse vectors. The learning problem is restricted to semi-infinite
    linear programming, and an LP solver is used to evaluate it (see the
    sketch below).
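    A hedged sketch of that wrapper scheme, following the SILP
    column-generation idea of Sonnenburg et al. (2006) but standing in
    scikit-learn's SVC for the SVM subproblem and SciPy's linprog for the
    restricted master LP; the helper silp_mkl is hypothetical, not SHOGUN's
    implementation:

        import numpy as np
        from scipy.optimize import linprog
        from sklearn.svm import SVC

        def silp_mkl(kernels, y, C=1.0, max_iter=30, tol=1e-5):
            """Alternate an SVM solve (generates a cutting plane) with a
            small LP over the kernel weights beta."""
            M, n = len(kernels), len(y)
            beta = np.ones(M) / M
            cuts, theta = [], None
            for _ in range(max_iter):
                K = sum(b * Kk for b, Kk in zip(beta, kernels))
                svm = SVC(C=C, kernel="precomputed").fit(K, y)
                a = np.zeros(n)
                a[svm.support_] = svm.dual_coef_.ravel()  # a_i = alpha_i*y_i
                # S_k(alpha) = 0.5 alpha^T Y K_k Y alpha - sum_i alpha_i
                S = (np.array([0.5 * a @ Kk @ a for Kk in kernels])
                     - np.abs(a).sum())
                if theta is not None and beta @ S >= theta - tol:
                    break                     # no violated constraint left
                cuts.append(S)
                # Restricted master LP over x = (beta, theta):
                #   max theta  s.t.  beta @ S_t >= theta for every cut t,
                #   sum(beta) = 1, beta >= 0.
                c = np.r_[np.zeros(M), -1.0]  # linprog minimizes, so -theta
                A_ub = np.hstack([-np.array(cuts), np.ones((len(cuts), 1))])
                res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(cuts)),
                              A_eq=np.r_[np.ones(M), 0.0][None, :],
                              b_eq=[1.0],
                              bounds=[(0, None)] * M + [(None, None)])
                beta, theta = res.x[:M], res.x[M]
            return beta

    Each SVM solve only needs the current combined kernel, so the base Gram
    matrices can stay precomputed, which is what keeps the scheme feasible
    for large sample counts.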
  5. Biological Kernels

    Should represent a biologically relevant notion of similarity; different
    kernels suit different situations and datasets. SVMs have advanced from
    analyzing vector-based data to more general forms (strings, etc.), but the
    resulting decision function is difficult to interpret and to use for
    extracting knowledge. SHOGUN claims to overcome this via the weights in
    the convex combination of kernels: especially if the weighting produced is
    sparse, the decision function is easy to interpret. Kernels can also be
    derived from probabilistic models. (A string-kernel sketch follows.)
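    One concrete string kernel of this kind is the k-spectrum kernel, which
    scores two sequences by their shared k-mer counts. A minimal sketch; the
    toy DNA fragments are illustrative, not from the talk:

        from collections import Counter

        def spectrum_kernel(s, t, k=3):
            """Inner product of the k-mer count vectors of two strings."""
            cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
            ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
            return sum(cs[m] * ct[m] for m in cs if m in ct)

        print(spectrum_kernel("ACGTACGT", "ACGTTGCA", k=3))

    Because the feature map is explicit (k-mer counts), a sparse weighting
    over several such kernels keeps the decision function interpretable in
    terms of which kernels actually matter.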
  6. Parallel Semidefinite Programming

    Prof. Fujisawa at the Tokyo Institute of Technology is the most active
    researcher here (Fujisawa et al., 2012; Yamashita, Fujisawa, & Kojima,
    2003). The necessary matrix factorization is parallelized (Fujisawa et
    al., 2012), similar to the method used for QP problems. SDPs are
    inherently sequential, and available software is currently sparse (SDPs
    have been investigated since ~2003).
  7. Cited Works

    Fujisawa, K., Sato, H., Matsuoka, S., Endo, T., Yamashita, M., & Nakata,
    M. (2012). High-performance general solver for extremely large-scale
    semidefinite programming problems. In Proceedings of the International
    Conference on High Performance Computing, Networking, Storage and
    Analysis (p. 93).

    Lanckriet, G. R. G., Cristianini, N., Bartlett, P., Ghaoui, L. E., &
    Jordan, M. I. (2004). Learning the kernel matrix with semidefinite
    programming. The Journal of Machine Learning Research, 5, 27–72.
    Retrieved 2012-11-05, from http://dl.acm.org/citation.cfm?id=1005334

    Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large
    scale multiple kernel learning. The Journal of Machine Learning Research,
    7, 1531–1565.

    Yamashita, M., Fujisawa, K., & Kojima, M. (2003). SDPARA: SemiDefinite
    Programming Algorithm paRAllel version.