Target optimization problem
- We have a linear objective to maximize in order to obtain a combined kernel matrix.
- The combined kernel is a linear combination of fixed kernel matrices that are computed beforehand:
  $K = \sum_i \mu_i K_i, \quad \mu \ge 0, \quad K \succeq 0$
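To make the combination concrete, here is a minimal numpy sketch with made-up base kernels and weights (none of them from the talk): the fixed matrices $K_i$ are computed once, and any nonnegative weighting keeps the combined $K$ positive semidefinite.

```python
import numpy as np

# A minimal sketch of the combined kernel K = sum_i mu_i * K_i with mu >= 0.
# The base kernels and weights below are made-up stand-ins; any fixed,
# precomputed PSD kernel matrices would do.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                     # toy data, 20 samples

def linear_kernel(X):
    return X @ X.T

def rbf_kernel(X, gamma=0.5):
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

kernels = [linear_kernel(X), rbf_kernel(X)]      # fixed K_i, computed beforehand
mu = np.array([0.7, 0.3])                        # mu >= 0

K = sum(m * Ki for m, Ki in zip(mu, kernels))

# With mu >= 0 and each K_i PSD, K is PSD: check the smallest eigenvalue.
print(np.linalg.eigvalsh(K).min() >= -1e-10)     # True
```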
Cholesky factorization in multiple kernel learning SVM
- In the software packages I have come across, all simply mark the factorization as failed if a negative value is encountered under the square root (the radicand).
- SHOGUN uses the Cholesky factorization from "Numerical Recipes in C++".
- Why?
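For illustration, a textbook Cholesky routine (a sketch, not SHOGUN's or Numerical Recipes' actual code) shows exactly the failure mode above: the factorization stops the moment the radicand goes negative, which happens precisely when the matrix is not positive definite.

```python
import numpy as np

# Textbook Cholesky factorization A = L L^T (a sketch, not SHOGUN's routine).
# If the value under the square root (the radicand) goes negative, the matrix
# is not positive definite and the factorization is reported as failed.
def cholesky(A):
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(n):
        radicand = A[j, j] - np.dot(L[j, :j], L[j, :j])
        if radicand <= 0.0:
            raise ValueError(f"factorization failed: radicand {radicand} at column {j}")
        L[j, j] = np.sqrt(radicand)
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - np.dot(L[i, :j], L[j, :j])) / L[j, j]
    return L

A = np.array([[4.0, 2.0], [2.0, 3.0]])   # positive definite: succeeds
print(cholesky(A))
B = np.array([[1.0, 2.0], [2.0, 1.0]])   # indefinite: radicand goes negative
cholesky(B)                              # raises ValueError
```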
- Rather (depending on the application), extended-precision number libraries or routines for dealing with high-dimensional data are needed.
- More precision is needed in some SVM variations (regression, etc.).
- SHOGUN supports up to 96-bit floating-point precision.
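As a rough illustration of what the extra precision buys, numpy's longdouble type maps to the x87 80-bit extended format on most x86 builds (stored in 96 or 128 bits). This is platform-dependent, so the sketch below is indicative only.

```python
import numpy as np

# Extended floating-point precision, in the spirit of SHOGUN's 96-bit float
# support. On most x86 builds numpy's longdouble is the x87 80-bit extended
# type; on other platforms it may just be float64, so this is a sketch, not
# a guarantee.
print(np.finfo(np.float64).eps)      # ~2.2e-16: 53-bit significand
print(np.finfo(np.longdouble).eps)   # ~1.1e-19 on x86: 64-bit significand

# Accumulating tiny terms: double precision absorbs them, extended keeps more.
tiny = 1e-18
print(np.float64(1.0) + np.float64(tiny) - 1.0)        # 0.0: absorbed
print(np.longdouble(1.0) + np.longdouble(tiny) - 1.0)  # nonzero residual on x86
```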
- Complexity of the original problem: $O((m + n_{\mathrm{tr}})^2 n^2)$
- With an added restriction on $\mu$, the optimization problem becomes a QCQP, which is a special case of SOCP.
- Complexity is $O(mn^3)$.
- SDPARA used similar constraints to achieve an equivalent complexity.
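The QCQP shape is easiest to see on a generic instance. The CVXPY sketch below is a made-up problem, not the MKL dual itself: a linear objective under one convex quadratic constraint per matrix, which is exactly the structure SOCP solvers exploit.

```python
import cvxpy as cp
import numpy as np

# Generic QCQP sketch (made-up data, not the MKL dual from the talk):
# linear objective, convex quadratic constraints. Solvers treat this as an
# SOCP, which is what makes the lower O(m n^3) complexity attainable.
rng = np.random.default_rng(0)
n, m = 5, 3
Qs = []
for _ in range(m):
    P = rng.normal(size=(n, n))
    Qs.append(P @ P.T + np.eye(n))        # made-up PSD constraint matrices

x = cp.Variable(n)
c = rng.normal(size=n)
objective = cp.Maximize(c @ x)
constraints = [cp.quad_form(x, Q) <= 1 for Q in Qs]   # one quadratic ball each
prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.status, x.value)
```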
Structure:
- Sparsity patterns (solvers do some automatic optimization for this)
- Low rank: the Schur complement can be formed more efficiently (see the sketch below)
- Algebraic symmetry: the data matrices belong to a matrix *-algebra
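A small numpy check of the low-rank point: Schur complement entries in interior-point methods involve traces of the form $\mathrm{tr}(A_i X A_j Y)$, and when the data matrices are rank one the trace collapses to a product of two quadratic forms. The matrices below are arbitrary stand-ins for the solver's iterates.

```python
import numpy as np

# If a data matrix is rank one, A_i = v_i v_i^T (as matrices built from input
# points often are), then tr(A_i X A_j Y) = (v_i^T X v_j)(v_j^T Y v_i):
# O(n^2) via quadratic forms instead of O(n^3) dense matrix products.
rng = np.random.default_rng(0)
n = 200
v_i, v_j = rng.normal(size=n), rng.normal(size=n)
X = rng.normal(size=(n, n)); X = X @ X.T        # stand-ins for the iterates
Y = rng.normal(size=(n, n)); Y = Y @ Y.T

A_i = np.outer(v_i, v_i)
A_j = np.outer(v_j, v_j)

dense = np.trace(A_i @ X @ A_j @ Y)             # O(n^3) dense products
lowrank = (v_i @ X @ v_j) * (v_j @ Y @ v_i)     # O(n^2) via quadratic forms
print(np.allclose(dense, lowrank))              # True
```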
Most targeted areas of SDP optimization
There are two exploitations that are most commonly used:
1. A specific type of constraint that, when removed, greatly simplifies the calculation: through Lagrangian relaxation, the optimization problem is decomposed into smaller subproblems that are solved independently (see the sketch below).
2. Low-rank structure in the basis matrices (from the input data): exploiting this structure lets the solver determine the search direction efficiently.
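A toy sketch of exploitation 1, on a made-up problem rather than an SDP: relaxing the single coupling constraint of a two-variable quadratic program splits it into two independent closed-form subproblems, with subgradient ascent on the multiplier recovering the coupled optimum.

```python
# Lagrangian relaxation on a toy problem:
#   minimize (x1-a1)^2 + (x2-a2)^2   subject to   x1 + x2 = b.
# The coupling constraint x1 + x2 = b is the "specific type of constraint"
# that ties the variables together; moving it into the objective with a
# multiplier lam splits the problem into independent subproblems.
a1, a2, b = 1.0, 3.0, 2.0

lam, step = 0.0, 0.5
for _ in range(100):
    # Each subproblem min_x (x-a)^2 + lam*x has the closed form x = a - lam/2
    # and is solved independently of the other.
    x1 = a1 - lam / 2.0
    x2 = a2 - lam / 2.0
    lam += step * (x1 + x2 - b)    # subgradient ascent on the dual
print(x1, x2, x1 + x2)             # -> 0.0, 2.0, and the sum equals b
```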
- Multikernel SVM (same computational complexity)
- Understanding the nature of the translation between SVM and SDP/QCQP
- Automated analysis and recognition of patterns in input data?
- How does structure exploitation relate to memory optimization/communication?
- Application domains
- Alternatives to SDP: replace SDP with other optimizations; bypass the solver and optimize the classifier and the kernel combination factors together (a sketch follows below)
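One hypothetical shape such a solver-free alternative could take (in the spirit of SimpleMKL, not a method from the talk): alternate between training an SVM on the current combined kernel and reweighting the base kernels by how much the trained classifier uses them.

```python
import numpy as np
from sklearn.svm import SVC

# Hedged sketch: instead of handing the kernel weights to an SDP solver,
# alternate (a) training the classifier on the current combined kernel and
# (b) nudging the weights mu toward kernels the classifier finds useful.
# The update rule (score each kernel by alpha^T K_i alpha, then renormalize)
# is a simple stand-in, not a method from the talk.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=60))   # toy binary labels

sq = (X**2).sum(1)
kernels = [X @ X.T,                                       # linear kernel
           np.exp(-0.5 * (sq[:, None] + sq[None, :] - 2 * X @ X.T))]  # RBF
mu = np.ones(len(kernels)) / len(kernels)

for _ in range(10):
    K = sum(m * Ki for m, Ki in zip(mu, kernels))
    svc = SVC(kernel="precomputed", C=1.0).fit(K, y)
    sv, coef = svc.support_, svc.dual_coef_.ravel()       # coef = y_i * alpha_i
    # Score each base kernel by the quadratic form on the support vectors.
    scores = np.array([coef @ Ki[np.ix_(sv, sv)] @ coef for Ki in kernels])
    mu = scores / scores.sum()                            # stay on the simplex
print(mu)
```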