Exact Rule Learning Via Boolean Compressed Sensing

Slide 1

Slide 1 text

Exact Rule Learning via Boolean Compressed Sensing
Anil Shanbhag (100050082)

Slide 2

Slide 2 text

Motivation
• Organizations are turning to predictive analysis to support decision making
• Output is given to people with limited analytics, data and modeling literacy
• Hence, it is imperative to have interpretable machine learning methods so that predictions are better adopted and trusted by decision makers

Slide 3

Slide 3 text

Salespersons voluntarily resigning

Slide 4

Slide 4 text

Brief History
• Older methods rely on heuristics and compare poorly with SVMs in the worst case
• Renewed interest in the field after Rückert & Kramer (2008), the first to introduce rule learning based on an objective function
• The work of Gilbert (2008) highlights parallels between group testing and sparse signal recovery

Slide 5

Slide 5 text

Contributions
The paper develops a new approach to interpretable supervised classification through rule learning based on Boolean compressed sensing. The major contributions of the paper are:
• Show that the problem of learning sparse conjunctive/disjunctive clause rules from training samples can be formulated as a group testing problem
• Relax the NP-hard problem into a form that resembles the basis pursuit algorithm for sparse signal recovery
• Establish conditions under which the relaxation recovers the rule exactly

Slide 6

Slide 6 text

Group Testing Problem

Slide 7

Slide 7 text

Parameters
• yi is the result (0 or 1) of pool i
• Aij is 1 if subject j is part of pool i, and 0 otherwise
• To find: wj, the true value of subject j, which should be recovered
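The relation between these parameters can be sketched in a few lines of Python: a pool tests positive exactly when it contains at least one positive subject. The function name `pool_results` and the toy matrix are illustrative assumptions, not taken from the slides.

```python
def pool_results(A, w):
    """Boolean measurement y = A OR-AND w: y_i = OR over j of (A_ij AND w_j)."""
    return [int(any(a and x for a, x in zip(row, w))) for row in A]

# Toy example (made-up values): 3 pools over 4 subjects, only subject 2 is positive.
A = [
    [1, 1, 0, 0],  # pool 0 contains subjects 0 and 1
    [0, 1, 1, 0],  # pool 1 contains subjects 1 and 2
    [0, 0, 1, 1],  # pool 2 contains subjects 2 and 3
]
w = [0, 0, 1, 0]
y = pool_results(A, w)
print(y)  # [0, 1, 1]
```

Group testing runs this in reverse: given A and y, recover the sparsest w that is consistent with them.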

Slide 8

Slide 8 text

Clause Learning
• We are given a corpus of training samples {Xi, yi}, where Xi in X are the features and yi = 0 or 1 is the label
• We would like to learn a function that maps Xi to yi
• Every classifier is made of a set of clauses, and each clause contains a set of Boolean terms

Slide 9

Slide 9 text

Clause Learning as Boolean Testing
• Consider
  – Every Boolean term as a subject
  – Every yi as the result of a pool
  – The matrix A as the Boolean result of each term applied to each sample, i.e. Aij = aj(Xi)
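A minimal sketch of this mapping, assuming each sample is a Python dict and each Boolean term is a predicate on it. The feature names, thresholds and sample values are made up for illustration.

```python
def build_measurement_matrix(samples, terms):
    """A_ij = a_j(X_i): row i is sample i, column j is Boolean term j."""
    return [[int(term(x)) for term in terms] for x in samples]

# Hypothetical Boolean terms on two hypothetical features.
terms = [
    lambda x: x["petal_length"] <= 5.4,
    lambda x: x["petal_width"] > 0.9,
]

# Made-up samples, used only to show the shape of A.
samples = [
    {"petal_length": 1.4, "petal_width": 0.2},
    {"petal_length": 4.7, "petal_width": 1.4},
    {"petal_length": 6.0, "petal_width": 2.5},
]

A = build_measurement_matrix(samples, terms)
print(A)  # [[1, 0], [1, 1], [0, 1]]
```

With A built this way, choosing which terms enter the clause is the same as choosing which "subjects" are positive in group testing.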

Slide 10

Slide 10 text

AND vs OR
• The given equations learn a disjunctive (OR) clause. To learn AND clauses (which are preferred), we need only a small modification:
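One standard way to obtain such a modification is De Morgan's law: negating an AND of terms gives an OR of the negated terms, so the OR formulation can be reused on complemented data. The notation y^C and A^C for the complements is an assumption introduced here.

```latex
% AND clause over the selected terms (w_j = 1 means term j is in the clause)
y_i = \bigwedge_{j \,:\, w_j = 1} A_{ij}

% Negate both sides and apply De Morgan's law
\neg y_i = \bigvee_{j \,:\, w_j = 1} \neg A_{ij}

% Equivalently, the OR formulation on complemented labels and matrix
y^{C} = A^{C} \vee w, \qquad y^{C} = \mathbf{1} - y, \quad A^{C} = \mathbf{1} - A
```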

Slide 11

Slide 11 text

Boolean Compressed Sensing
• Compressed sensing is a signal processing technique for effectively measuring and reconstructing signals
• There are clear similarities; however, we work in Boolean algebra instead of real algebra
• We apply similar techniques to formulate the problem and use the LP relaxation from compressed sensing

Slide 12

Slide 12 text

Checkpoint 1: Writing as an LP with the l1 norm
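In the group testing notation (pool results y, matrix A, unknown w), this step can be sketched as follows. The index sets P and Z are names introduced here for the positive and negative pools; the formulation is a sketch under those assumptions.

```latex
% Exact (NP-hard) problem: the sparsest Boolean w consistent with the pools,
% where (A \vee w)_i = \bigvee_j (A_{ij} \wedge w_j)
\min_{w \in \{0,1\}^n} \|w\|_0 \quad \text{s.t.} \quad y = A \vee w

% For binary w, \|w\|_0 = \|w\|_1 = \sum_j w_j, and the Boolean constraint
% splits by pool outcome, with P = \{i : y_i = 1\} and Z = \{i : y_i = 0\}
\min_{w \in \{0,1\}^n} \sum_{j=1}^{n} w_j
\quad \text{s.t.} \quad A_P w \ge \mathbf{1}, \quad A_Z w = \mathbf{0}
```

The objective and constraints are now linear; only the requirement that w be binary keeps this an integer program.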

Slide 13

Slide 13 text

Checkpoint 2: Relaxation of Boolean integer constraints
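A sketch of the relaxed problem, assuming P = {i : yi = 1} and Z = {i : yi = 0} denote the positive and negative pools (names introduced here): the binary constraint on w is replaced by an interval constraint, which leaves an ordinary linear program.

```latex
% LP relaxation: w \in \{0,1\}^n is replaced by 0 \le w \le 1
\min_{w \in \mathbb{R}^n} \sum_{j=1}^{n} w_j
\quad \text{s.t.} \quad
A_P w \ge \mathbf{1}, \quad
A_Z w = \mathbf{0}, \quad
\mathbf{0} \le w \le \mathbf{1}
```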

Slide 14

Slide 14 text

Slack to accommodate errors
• There may be no sparse rule that reproduces the labels y exactly, but there may be one that approximates y closely
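One way to write this down, as a sketch: add a non-negative slack variable per sample and penalize the total slack alongside the sparsity of w. The slack vector ξ, the trade-off weight λ, and the index sets P = {i : yi = 1} and Z = {i : yi = 0} are notation assumed here.

```latex
% Slack \xi_i absorbs the labels that the sparse rule gets wrong;
% \lambda > 0 trades off rule sparsity against training errors
\min_{w,\, \xi} \; \lambda \sum_{j=1}^{n} w_j + \sum_{i} \xi_i
\quad \text{s.t.} \quad
A_P w + \xi_P \ge \mathbf{1}, \quad
A_Z w = \xi_Z, \quad
\mathbf{0} \le w \le \mathbf{1}, \quad
\mathbf{0} \le \xi_P \le \mathbf{1}, \quad
\xi_Z \ge \mathbf{0}
```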

Slide 15

Slide 15 text

Exact Recovery
• K-disjunct: a measurement matrix A is K-disjunct if the union of any K columns does not contain any other column of A
• If there exists a w* with K non-zero entries such that y = A ∨ w*, and the matrix A is K-disjunct, then the LP recovers w* exactly
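The K-disjunct definition can be checked directly on small matrices by brute force. The sketch below treats each column as the set of rows where it is 1; the function names are illustrative, and the enumeration is exponential in K, so it is only meant for toy examples.

```python
from itertools import combinations

def column_supports(A):
    """Represent each column of A as the set of row indices where it equals 1."""
    return [frozenset(i for i, row in enumerate(A) if row[j])
            for j in range(len(A[0]))]

def is_k_disjunct(A, K):
    """True if no union of K columns contains any other column of A."""
    cols = column_supports(A)
    for subset in combinations(range(len(cols)), K):
        union = frozenset().union(*(cols[j] for j in subset))
        for j, col in enumerate(cols):
            if j not in subset and col <= union:
                return False
    return True

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
nested = [[1, 1], [0, 1]]  # column 0 is contained in column 1
print(is_k_disjunct(identity, 2))  # True
print(is_k_disjunct(nested, 1))    # False
```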

Slide 16

Slide 16 text

• A is (ε, K)-disjunct if, out of all (n choose K) subsets of K columns, a (1 − ε) fraction satisfy the property that their union does not contain any other column
• If the matrix A is (ε, K)-disjunct, then the LP recovers the correct solution with probability 1 − ε
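For a small matrix, the fraction in this definition can be computed by enumerating every K-subset of columns. This is a brute-force sketch with an illustrative function name; the smallest ε for which A is (ε, K)-disjunct is one minus the returned value.

```python
from itertools import combinations

def disjunct_fraction(A, K):
    """Fraction of K-subsets of columns whose union contains no other column."""
    cols = [frozenset(i for i, row in enumerate(A) if row[j])
            for j in range(len(A[0]))]
    good = total = 0
    for subset in combinations(range(len(cols)), K):
        total += 1
        union = frozenset().union(*(cols[j] for j in subset))
        if not any(j not in subset and col <= union
                   for j, col in enumerate(cols)):
            good += 1
    return good / total

# Toy matrix: column 2 is the union of columns 0 and 1.
A = [[1, 0, 1],
     [0, 1, 1]]
frac = disjunct_fraction(A, 1)
print(frac)  # 2 of the 3 single-column subsets are good
```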

Slide 17

Slide 17 text

Continuous Features
• For continuous features we choose thresholds suitably spread across the domain
• T1 <= T2 <= T3 <= T4 <= … <= Tn
• Each threshold value leads to two indicator functions, for example for T1: I(xj >= T1) and I(xj < T1)
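A minimal sketch of this construction. Evenly spaced thresholds are only one possible reading of "suitably spread" and are an assumption here, as are the function names.

```python
def evenly_spaced_thresholds(values, n_thresholds):
    """Cut points strictly between min(values) and max(values) (assumed spacing)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (n_thresholds + 1)
    return [lo + step * (k + 1) for k in range(n_thresholds)]

def threshold_terms(feature_index, thresholds):
    """Two Boolean terms per threshold T: x[f] >= T and x[f] < T."""
    terms = []
    for t in thresholds:
        # t=t binds the current threshold to each lambda
        terms.append((f"x[{feature_index}] >= {t}",
                      lambda x, t=t: x[feature_index] >= t))
        terms.append((f"x[{feature_index}] < {t}",
                      lambda x, t=t: x[feature_index] < t))
    return terms

thresholds = evenly_spaced_thresholds([0.0, 4.0], 3)
terms = threshold_terms(0, thresholds)
row = [int(fn((2.5,))) for _, fn in terms]
print(thresholds)  # [1.0, 2.0, 3.0]
print(row)         # [1, 0, 1, 0, 0, 1]
```

Each sample contributes one such row to the matrix A, with two columns per threshold.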

Slide 18

Slide 18 text

Learning Rule Sets
• We use the set covering approach, which is a common technique
• The first step is to learn an AND rule (here using ideas from Boolean compressed sensing)
• Once we know one AND rule, remove all training samples that are identified by the rule
• Now repeat the learning on the remaining samples
• The resulting rule set is in disjunctive normal form (DNF)
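The set covering loop can be sketched as below. The single-rule learner is a brute-force stand-in, not the LP from the paper: it looks for the smallest AND rule that covers no negative sample and as many positive samples as possible. All function names and the toy data are illustrative assumptions.

```python
from itertools import combinations

def covers(row, rule):
    """An AND rule (tuple of term indices) fires when all its terms are 1."""
    return all(row[j] for j in rule)

def learn_and_rule(A, y, max_terms=3):
    """Brute-force stand-in for the LP: smallest rule with no false positives."""
    for size in range(1, max_terms + 1):
        best, best_cov = None, 0
        for rule in combinations(range(len(A[0])), size):
            if any(covers(r, rule) for r, lab in zip(A, y) if lab == 0):
                continue  # rule fires on a negative sample
            cov = sum(covers(r, rule) for r, lab in zip(A, y) if lab == 1)
            if cov > best_cov:
                best, best_cov = rule, cov
        if best is not None:
            return best
    return None

def learn_dnf(A, y, max_rules=10):
    """Set covering: learn a rule, drop the samples it covers, repeat."""
    rows, dnf = list(zip(A, y)), []
    while any(lab == 1 for _, lab in rows) and len(dnf) < max_rules:
        rule = learn_and_rule([r for r, _ in rows], [lab for _, lab in rows])
        if rule is None:
            break
        dnf.append(rule)
        rows = [(r, lab) for r, lab in rows if not covers(r, rule)]
    return dnf

def predict(dnf, row):
    """A DNF predicts 1 when any of its AND rules fires."""
    return int(any(covers(row, rule) for rule in dnf))

# Made-up data: 5 samples, 3 Boolean terms.
A = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
y = [1, 1, 1, 0, 0]
dnf = learn_dnf(A, y)
print(dnf)  # [(2,), (0, 1)]
```

Read as a formula, the result is (term 2) OR (term 0 AND term 1), which reproduces every label in the toy data.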

Slide 19

Slide 19 text

Evaluating the approach
• I have tried to evaluate the approach used in the paper on the Iris dataset
• The Iris dataset is a set of 150 tuples, each describing one iris flower. There are four features: sepal width, sepal length, petal length and petal width. There are three types of flowers: setosa, versicolor and virginica

Slide 20

Slide 20 text

Three rules learnt •  Petal length <= 5.4 •  Petal Width <= 1.7 •  Petal Width > 0.9

Slide 21

Slide 21 text

References
• Malioutov, D. M. and Varshney, K. R. Exact Rule Learning via Boolean Compressed Sensing.
• Malioutov, D. and Malyutov, M. Boolean compressed sensing: LP relaxation for group testing. In Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 3305–3308, Kyoto, Japan, Mar. 2012.