Paper reading: "Simple and Deterministic Matrix Sketching"

Slide 1

Slide 1 text

Simple and Deterministic Matrix Sketching
Edo Liberty (Yahoo! Labs), KDD 2013 (best research paper)
Paper reading by 高柳慎一 (@_stakaya)

Slide 2

Slide 2 text

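Today's takeaways
• The paper proposes a matrix compression method with a guaranteed approximation accuracy
• An n x m matrix A is compressed into an l x m matrix B
– l << n
• B is built so that B^T B closely approximates A^T A (a variance-like quantity)
• The approximation comes with a theoretical guarantee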

Slide 3

Slide 3 text

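Matrix Sketching
• Matrix sketching removes rows rather than columns
– PCA, everyone's favorite, removes columns
• Intuitively, the operation keeps the frequently occurring vectors
– Something like finding a basis of the row space (not the feature (column) space)
– Think of keeping the cluster centers from k-means
• The paper proposes a method called Frequent Directions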

Slide 4

Slide 4 text

Algorithm
• Runs in O(nml) time (each SVD costs O(ml^2) and is performed n/(l/2) times)
• The rows of A are processed one at a time in a sequential loop
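The loop described on this slide can be sketched in base R as follows. This is a minimal illustration written from the algorithm in Liberty's paper, not the code of the frequentdirections package; the function name `frequent_directions` and the random test matrix are assumptions made for the example, and it assumes 2 <= l <= ncol(a) with an even l.

```r
# Frequent Directions: sketch an n x m matrix `a` into an l x m matrix.
# Hypothetical helper for illustration; assumes 2 <= l <= ncol(a).
frequent_directions <- function(a, l) {
  stopifnot(is.matrix(a), l >= 2, l <= ncol(a))
  b <- matrix(0, nrow = l, ncol = ncol(a))
  next_zero <- 1L                      # index of the first all-zero row of b
  for (i in seq_len(nrow(a))) {
    b[next_zero, ] <- a[i, ]           # insert row i of a into a zero row
    next_zero <- next_zero + 1L
    if (next_zero > l) {               # b has no zero rows left
      s <- svd(b)                      # b = U diag(d) V^T, d sorted decreasingly
      delta <- s$d[l %/% 2]^2          # squared (l/2)-th singular value
      d_shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- d_shrunk * t(s$v)           # same as diag(d_shrunk) %*% t(V)
      next_zero <- sum(d_shrunk > 0) + 1L
    }
  }
  b
}

# Synthetic data (an assumption for the example, not the USPS data).
set.seed(1)
a <- matrix(rnorm(200 * 20), nrow = 200)
l <- 10
b <- frequent_directions(a, l)

err <- norm(crossprod(a) - crossprod(b), type = "2")  # spectral norm
bound <- 2 * norm(a, type = "F")^2 / l                # bound from the paper
cat(sprintf("error = %.3f, bound = %.3f\n", err, bound))
```

Each shrink step zeroes at least half of the rows, so an SVD is needed only about once every l/2 inserted rows, which is where the n/(l/2) count comes from.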

Slide 5

Slide 5 text

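Theoretical upper bound
• || ⋅ ||_F is the Frobenius norm
• || ⋅ ||_2 is the spectral norm (probably)
• Once the reduced dimension l is chosen, the upper bound is determined automatically (nice)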
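For reference, the guarantee as stated in Liberty's paper can be written as below, with A the n x m input and B the l x m sketch. The formula itself did not survive in this transcript, so the notation (subscript F for Frobenius, subscript 2 for spectral) is an assumption chosen to match the bullets above.

```latex
0 \preceq B^\top B \preceq A^\top A,
\qquad
\|A^\top A - B^\top B\|_2 \le \frac{2\,\|A\|_F^2}{\ell}
```

For example, choosing l = 64 bounds the error by ||A||_F^2 / 32, regardless of the number of rows n.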

Slide 6

Slide 6 text

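Other benefits
• Online updates are possible
– Each row can be processed independently (the for ... loop in the Algorithm)
• Parallelizable
– Split matrix A into submatrices, sketch each one, then merge the sketches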
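The split-then-merge idea can be sketched in base R: sketch each block of rows separately, stack the two sketches, and sketch the stack again. The helper `fd_sketch` is a hypothetical compact implementation written for this example (assuming 2 <= l <= ncol(a)), not the frequentdirections package API, and the data are synthetic.

```r
# Hypothetical helper: compact Frequent Directions sketch (2 <= l <= ncol(a)).
fd_sketch <- function(a, l) {
  b <- matrix(0, nrow = l, ncol = ncol(a))
  k <- 1L                                   # next free (all-zero) row of b
  for (i in seq_len(nrow(a))) {
    b[k, ] <- a[i, ]
    k <- k + 1L
    if (k > l) {                            # buffer full: shrink via SVD
      s <- svd(b)
      d <- sqrt(pmax(s$d^2 - s$d[l %/% 2]^2, 0))
      b <- d * t(s$v)                       # diag(d) %*% t(V)
      k <- sum(d > 0) + 1L
    }
  }
  b
}

set.seed(42)
a <- matrix(rnorm(300 * 12), nrow = 300)
l <- 6

# Sketch the two halves independently (these could run on separate workers).
b1 <- fd_sketch(a[1:150, ], l)
b2 <- fd_sketch(a[151:300, ], l)

# Merge: stack the two sketches and sketch the result once more.
b_merged <- fd_sketch(rbind(b1, b2), l)

err <- norm(crossprod(a) - crossprod(b_merged), type = "2")
bound <- 2 * norm(a, type = "F")^2 / l
cat(sprintf("merged error = %.3f, bound = %.3f\n", err, bound))
```

According to the paper, the merged sketch satisfies the same 2 ||A||_F^2 / l bound as a sketch computed in a single pass, which is what makes the split safe.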

Slide 7

Slide 7 text

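Before we start
• A frequently used identity
– A_i is the i-th row of matrix A
– B^i is matrix B at iteration i
– C^i is matrix C at iteration i
• The identity holds because C^i is just B^{i-1} with A_i inserted into an all-zero row and then rotated by V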
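Written out, the identity referred to here is presumably the one used in Liberty's proof; the formula is missing from the transcript, so this is a reconstruction. A_i is the i-th row of A (a 1 x m row vector), B^i and C^i are the matrices B and C at iteration i, and B^0 is the zero matrix. On iterations where no SVD is performed, read C^i as B^{i-1} with A_i inserted.

```latex
(C^{i})^\top C^{i} = (B^{i-1})^\top B^{i-1} + A_i^\top A_i,
\qquad
\|C^{i}\|_F^2 = \|B^{i-1}\|_F^2 + \|A_i\|^2
```

Both forms follow because multiplying by an orthogonal matrix changes neither the Gram matrix C^T C nor the Frobenius norm.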

Slide 8

Slide 8 text

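Proof of the upper bound, part 1
• x is the eigenvector belonging to the largest eigenvalue of A^T A − B^T B
• In the sum, everything cancels except the B^0 and B^n terms
• B^0 is the zero matrix, so that term drops out as well
• Substitute the identity from the previous slide
• Substitute the Σ matrices defined in the Algorithm
• Take (roughly speaking) the square root of the middle difference term and apply the Schwarz inequality
• Work through the definition of the matrix norm
• Refer to the corresponding step of the algorithm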
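The annotations above correspond to the following chain, reconstructed from the argument in Liberty's paper since the slide's formulas are not in the transcript. Here x is a unit eigenvector for the largest eigenvalue of A^T A − B^T B, B^0 = 0, B^n = B, Σ^i and Σ'^i are the singular value matrices before and after shrinking at iteration i, and δ_i is the shrink amount at iteration i (0 when no SVD is performed).

```latex
\begin{align*}
\|A^\top A - B^\top B\|_2
  &= x^\top (A^\top A - B^\top B)\, x = \|Ax\|^2 - \|Bx\|^2 \\
  &= \sum_{i=1}^{n} \left[ \langle A_i, x \rangle^2 + \|B^{i-1}x\|^2 - \|B^{i}x\|^2 \right] \\
  &= \sum_{i=1}^{n} \left[ \|C^{i}x\|^2 - \|B^{i}x\|^2 \right] \\
  &\le \sum_{i=1}^{n} \left\| (C^{i})^\top C^{i} - (B^{i})^\top B^{i} \right\|_2 \|x\|^2 \\
  &= \sum_{i=1}^{n} \left\| (\Sigma^{i})^2 - (\Sigma'^{i})^2 \right\|_2
   = \sum_{i=1}^{n} \delta_i
\end{align*}
```

The second line is the telescoping sum in which only the B^0 and B^n terms survive, and the last line uses the fact that the largest entry of (Σ^i)^2 − (Σ'^i)^2 is exactly δ_i.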

Slide 9

Slide 9 text

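Proof of the upper bound, part 2
• This holds because, by construction of the algorithm, at least l/2 of the squared singular values are no smaller than δ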
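The inequality in question is presumably the Frobenius-norm bound from Liberty's paper, reconstructed below under the same notation: A_i is the i-th row of A, B^i and C^i are B and C at iteration i, Σ^i and Σ'^i are the singular value matrices before and after shrinking, and δ_i is the shrink amount.

```latex
\begin{align*}
\|B\|_F^2
  &= \sum_{i=1}^{n} \left[ \|B^{i}\|_F^2 - \|B^{i-1}\|_F^2 \right] \\
  &= \sum_{i=1}^{n} \left[ \left( \|C^{i}\|_F^2 - \|B^{i-1}\|_F^2 \right)
       - \left( \|C^{i}\|_F^2 - \|B^{i}\|_F^2 \right) \right] \\
  &= \sum_{i=1}^{n} \left[ \|A_i\|^2
       - \operatorname{tr}\!\left( (\Sigma^{i})^2 - (\Sigma'^{i})^2 \right) \right] \\
  &\le \|A\|_F^2 - \frac{\ell}{2} \sum_{i=1}^{n} \delta_i
\end{align*}
```

The last step is where the l/2 count enters: each of the top l/2 squared singular values loses exactly δ_i, so the trace is at least (l/2) δ_i.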

Slide 10

Slide 10 text

Proof of the upper bound
• Combining parts 1 and 2 of the proof gives the result
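Spelled out, the combination is a two-step substitution. This is a reconstruction following Liberty's paper, where δ_i is the amount subtracted from the squared singular values at iteration i, part 1 gives the spectral-norm bound by the sum of the δ_i, and part 2 gives the Frobenius-norm bound on B.

```latex
0 \le \|B\|_F^2 \le \|A\|_F^2 - \frac{\ell}{2}\sum_{i=1}^{n}\delta_i
\;\Longrightarrow\;
\sum_{i=1}^{n}\delta_i \le \frac{2\,\|A\|_F^2}{\ell}
\;\Longrightarrow\;
\|A^\top A - B^\top B\|_2 \le \sum_{i=1}^{n}\delta_i \le \frac{2\,\|A\|_F^2}{\ell}
```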

Slide 11

Slide 11 text

Trying it out
• A Python version was done a few years ago by Hido-san of PFN
https://www.slideshare.net/shoheihido/kdd-25788780

Slide 12

Slide 12 text

Trying it out
• Follow Hido-san's experiment exactly
– What appears to be the same dataset was available on Kaggle
https://www.kaggle.com/bistaumanga/usps-dataset/version/1

Slide 13

Slide 13 text

Trying it out
• There is no R version (I think)
– If it does not exist, let's build it: Hoxo-m (ホクソエム)

Slide 14

Slide 14 text

Built it
https://github.com/shinichi-takayanagi/frequent-directions

Slide 15

Slide 15 text

Installation
devtools::install_github("shinichi-takayanagi/frequentdirections")

Slide 16

Slide 16 text

Reading the data
• Download the sample data from Kaggle, then load it:
library("h5")
# Open the HDF5 file downloaded from Kaggle
file <- h5file("C:\\temp\\usps.h5")
# Standardize each column of the training data; read the labels as-is
x <- scale(file["train/data"][])
y <- file["train/target"][]
> x[1:5, 1:6]
           [,1]      [,2]       [,3]       [,4]       [,5]       [,6]
[1,] -0.0692796 -0.124749 -0.1999769 -0.3113933 -0.4506665 -0.6198367
[2,] -0.0692796 -0.124749 -0.1999769  0.2073071  0.2038526 -0.3160401
[3,] -0.0692796 -0.124749 -0.1999769 -0.3113933 -0.4506665 -0.6198367
[4,] -0.0692796 -0.124749 -0.1999769 -0.3113933 -0.4506665  0.5364991
[5,] -0.0692796 -0.124749 -0.1999769 -0.3113933 -0.4506665 -0.5053164

Slide 17

Slide 17 text

Reading the data
image(matrix(x[338,], nrow=16, byrow = FALSE))

Slide 18

Slide 18 text

Plain SVD first
frequentdirections::plot_svd(x, y)
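The slides do not show what plot_svd does internally. A plausible reading, and it is only an assumption, is that it projects the rows of x onto the top two right singular vectors and colors the points by label y. A base R sketch of that idea on synthetic two-class data (not the USPS data):

```r
# Assumed behaviour only: project rows onto the top-2 right singular vectors
# and color by class label. This is not the frequentdirections implementation.
set.seed(0)
x <- scale(rbind(
  matrix(rnorm(50 * 4, mean = 0), nrow = 50),   # class 0
  matrix(rnorm(50 * 4, mean = 3), nrow = 50)    # class 1
))
y <- rep(c(0, 1), each = 50)

v <- svd(x)$v[, 1:2]        # top two right singular vectors (4 x 2)
proj <- x %*% v             # 100 x 2 coordinates of each row

plot(proj, col = y + 1, pch = 19,
     xlab = "1st singular direction", ylab = "2nd singular direction")
```

Under the same assumption, passing a sketch as the third argument would replace svd(x) with the SVD of the much smaller sketch matrix while still projecting the original rows of x.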

Slide 19

Slide 19 text

Homemade Matrix Sketching (l=8)
eps <- 10^(-8)
frequentdirections::plot_svd(x, y, frequentdirections::sketching(x, 8, eps))

Slide 20

Slide 20 text

Homemade Matrix Sketching (l=16)
frequentdirections::plot_svd(x, y, frequentdirections::sketching(x, 16, eps))

Slide 21

Slide 21 text

Homemade Matrix Sketching (l=32)
frequentdirections::plot_svd(x, y, frequentdirections::sketching(x, 32, eps))

Slide 22

Slide 22 text

Homemade Matrix Sketching (l=64)
frequentdirections::plot_svd(x, y, frequentdirections::sketching(x, 64, eps))

Slide 23

Slide 23 text

Homemade Matrix Sketching (l=128)
frequentdirections::plot_svd(x, y, frequentdirections::sketching(x, 128, eps))

Slide 24

Slide 24 text

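Summary
• Studied a matrix compression method with a guaranteed approximation accuracy
• Having a theoretical guarantee is nice
• Built an R package
– It seems to be working, but something also feels slightly off
– Unit tests and a CRAN release come next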