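[Figure: A(D_1, k), A(D_2, k), …, A(D_n, k) each produce one column X_{:,1}, X_{:,2}, …, X_{:,n} of a binary matrix X (rows: # features, columns: # of runs).]

$$
X = \begin{bmatrix}
1 & 1 & 0 & \cdots & 1 & 1 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
1 & 0 & 1 & \cdots & 1 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 1 & 1 & \cdots & 1 & 1
\end{bmatrix}
$$

Reduce & Inference:

$$
\sum_i X_{j,i} \;\rightarrow\; \zeta_{\mathrm{crit}} \;\rightarrow\; \text{if feature } j \text{ is relevant}
$$

$$
\Lambda(Z) = \frac{P(T(Z) \mid H_1)}{P(T(Z) \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \zeta_{\mathrm{crit}}
\;\rightarrow\;
\frac{\binom{n}{z} p_1^{z} (1 - p_1)^{n-z}}{\binom{n}{z} p_0^{z} (1 - p_0)^{n-z}} \underset{H_0}{\overset{H_1}{\gtrless}} \zeta_{\mathrm{crit}}
$$

$$
\alpha = P(T(Z) > \zeta_{\mathrm{crit}} \mid H_0)
$$

1 G. Ditzler, R. Polikar, and G. Rosen, "A bootstrap based Neyman-Pearson test for identifying variable importance," IEEE Transactions on Neural Networks and Learning Systems, 2014.

EESI Group Meeting (February 2015) An Introduction to MapReduce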
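The Reduce & Inference step above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. It assumes that the row sum of X follows a Binomial(n, p0) distribution under H0, and it picks the critical value as the smallest integer whose upper-tail probability under H0 is at most α. The function names and the way p0 is supplied are hypothetical.

```python
from math import comb


def binomial_critical_value(n, p0, alpha):
    """Smallest integer zeta such that P(Z > zeta | H0) <= alpha,
    where Z ~ Binomial(n, p0) under H0 (an assumption of this sketch)."""
    cdf = 0.0
    for z in range(n + 1):
        # Accumulate P(Z <= z) term by term from the binomial pmf.
        cdf += comb(n, z) * p0**z * (1 - p0) ** (n - z)
        if 1.0 - cdf <= alpha:
            return z
    return n


def relevant_features(X, p0, alpha=0.05):
    """X is a list of rows (one row per feature, one column per run),
    with entries in {0, 1}. Returns (indices of relevant features, zeta)."""
    n = len(X[0])                          # number of runs
    zeta = binomial_critical_value(n, p0, alpha)
    counts = [sum(row) for row in X]       # reduce: sum_i X[j][i]
    selected = [j for j, c in enumerate(counts) if c > zeta]
    return selected, zeta
```

Each row sum depends only on that feature's row, so the summation maps naturally onto a reduce keyed by feature index j, with the threshold comparison applied to the reduced count.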