
Exact Rule Learning Via Boolean Compressed Sensing

Anil Shanbhag
September 09, 2013

Presents a novel approach to exact rule learning, driven by an objective, using techniques from compressed sensing and group testing.



Transcript

1. Motivation
   • Organizations are turning to predictive analytics to support decision making.
   • The output is given to people with limited analytics, data, and modeling literacy.
   • Hence, it is imperative to have interpretable machine learning methods, so that predictions are better adopted/trusted by decision makers.

2. Brief History
   • Old methods rely on heuristics and perform badly in the worst case in comparison to SVMs.
   • Renewed interest in the field came after Ruckert & Kramer (2008), the first to introduce rule learning based on an objective.
   • The work of Gilbert (2008) highlights parallels between group testing and sparse signal recovery.

3. Contributions
   The paper develops a new approach to interpretable supervised classification through rule learning based on Boolean compressed sensing. The major contributions of the paper are:
   • Show that the problem of learning sparse conjunctive/disjunctive clause rules from training samples can be formulated as a group testing problem.
   • Reduce the NP-hard problem via a relaxation that resembles the basis pursuit algorithm for sparse signal recovery.
   • Establish conditions under which the relaxation recovers the rule exactly.

4. Parameters
   • y_i is the value of pool i.
   • A_ij is 0 or 1 depending on whether subject j is part of the i-th group or not.
   • To find: w_i, the true value of subject i, which should be recovered.

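The group-testing measurement model behind these parameters can be sketched in a few lines (a hypothetical toy instance; the matrix and subject values below are made up for illustration):

```python
import numpy as np

# A[i, j] = 1 if subject j is included in pool i.
A = np.array([
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 1],
])

# w[j] = 1 if subject j is "defective" -- the unknown to recover.
w = np.array([0, 0, 1, 0, 0, 0])

# Boolean measurement model: pool i tests positive iff it
# contains at least one defective subject.
y = (A @ w > 0).astype(int)
print(y.tolist())  # [1, 0, 0, 1]: the pools containing subject 2
```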
5. Clause Learning
   • Given a corpus of training samples {x_i, y_i}, where x_i ∈ X are the features and y_i = 0 or 1.
   • We would like to learn a function that maps x_i to y_i.
   • Every classifier is made of a set of clauses; each clause contains a set of boolean terms.

6. Clause Learning as Boolean Testing
   • Consider:
     – every boolean term as a subject,
     – every y_i as the result of a pool,
     – the matrix A as the boolean result of each term applied to x_i, i.e. A_ij = a_j(x_i).

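The mapping from training data to a measurement matrix can be sketched as follows (the feature values and candidate terms a_j here are hypothetical, just to show the A_ij = a_j(x_i) construction):

```python
import numpy as np

# Hypothetical toy features and candidate boolean terms a_j.
X = np.array([
    [5.1, 1.4],
    [6.0, 4.5],
    [4.9, 1.5],
])
terms = [
    lambda x: x[0] >= 5.5,   # a_0
    lambda x: x[1] < 2.0,    # a_1
    lambda x: x[1] >= 4.0,   # a_2
]

# A[i, j] = a_j(x_i), exactly as in the slide.
A = np.array([[int(a(x)) for a in terms] for x in X])
print(A.tolist())  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```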
7. AND vs OR
   • The given equations learn a disjunctive (OR) clause. To learn AND clauses (which are preferred), we just need to make a small modification.

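The transcript does not show the modification itself; one standard route (an assumption here, not necessarily the exact step on the slide) is De Morgan's law: an AND of terms is the complement of an OR of the negated terms, so the AND-learning problem becomes the same OR-learning problem on complemented data.

```python
import numpy as np

# Hypothetical small instance of the OR-learning setup.
A = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
y = np.array([1, 0, 1])

# De Morgan trick (assumed): negate every term and complement
# the labels, then solve the same disjunctive (OR) problem.
A_bar = 1 - A  # negated boolean terms, 1 - a_j(x_i)
y_bar = 1 - y  # complemented labels
# The support w recovered on (A_bar, y_bar) gives the AND clause
# on the original (A, y).
print(A_bar.tolist(), y_bar.tolist())
```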
8. Boolean Compressed Sensing
   • Compressed sensing is a signal processing technique for efficiently measuring and reconstructing signals.
   • There are clear similarities, but here we are restricted to boolean algebra instead of real algebra.
   • We apply similar techniques to formulate the problem and use the LP relaxation from compressed sensing.

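A minimal sketch of the LP relaxation, using `scipy.optimize.linprog` on a hypothetical small instance: positive pools must be covered at least once, negative pools not at all, and the boolean w is relaxed to the box [0, 1] with a sum objective standing in for sparsity.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical measurement matrix and pool outcomes.
A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1]])
y = np.array([1, 0, 1, 0])

pos, neg = A[y == 1], A[y == 0]
res = linprog(
    c=np.ones(A.shape[1]),               # minimize sum(w): sparsity proxy
    A_ub=-pos, b_ub=-np.ones(len(pos)),  # every positive pool: A[i] @ w >= 1
    A_eq=neg, b_eq=np.zeros(len(neg)),   # every negative pool: A[i] @ w == 0
    bounds=[(0, 1)] * A.shape[1],        # relax w from {0,1} to [0,1]
)
w = np.round(res.x).astype(int)
print(w.tolist())  # [1, 0, 0, 0]: only subject 0 explains the outcomes
```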
9. Slack to Accommodate Errors
   • There may be no sparse rule that reproduces the labels y exactly, but there may be one that approximates y closely.

10. Exact Recovery
   • K-disjunct: a measurement matrix A is K-disjunct if the union of any K columns does not contain any other column of A.
   • If there exists a w* with K non-zero entries and the matrix is K-disjunct, then the LP recovers it exactly.

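The K-disjunct property can be checked by brute force on small matrices; a sketch (the example matrices below are hypothetical):

```python
from itertools import combinations
import numpy as np

def is_k_disjunct(A, K):
    """Brute-force check: the boolean union (OR) of any K columns
    must not contain any other column of A."""
    n = A.shape[1]
    for S in combinations(range(n), K):
        union = A[:, S].max(axis=1)  # boolean OR of the K columns
        for j in range(n):
            if j in S:
                continue
            # column j is "contained" if the union covers it everywhere
            if np.all(A[:, j] <= union):
                return False
    return True

print(is_k_disjunct(np.eye(3, dtype=int), 1))  # True

B = np.array([[1, 1],
              [0, 1]])
print(is_k_disjunct(B, 1))  # False: column 0 is contained in column 1
```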
11. • A is (e, K)-disjunct if, out of the (n choose K) subsets of K columns, a (1 − e) fraction satisfy the property that their union does not contain any other column.
   • If the matrix A is (e, K)-disjunct, then the LP recovers the correct solution with probability 1 − e.

12. Continuous Features
   • For continuous features we choose thresholds suitably spread across the domain:
     T1 ≤ T2 ≤ T3 ≤ … ≤ Tn.
   • Each threshold value T leads to two indicator functions: I(x_j ≥ T) and I(x_j < T).

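Thresholding a continuous feature into boolean indicator terms might look like this (using quantiles to spread the thresholds is an assumption; the slide only says "suitably spread"):

```python
import numpy as np

# One hypothetical continuous feature column.
x = np.array([1.4, 4.5, 1.5, 5.1, 3.9])

# Thresholds spread across the domain via quantiles (assumed).
thresholds = np.quantile(x, [0.25, 0.5, 0.75])

# Each threshold T yields the two indicators I(x >= T) and I(x < T).
features = np.column_stack(
    [(x >= T).astype(int) for T in thresholds] +
    [(x < T).astype(int) for T in thresholds]
)
print(features.shape)  # (5, 6): 5 samples, 2 indicators per threshold
```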
13. Learning Rule Sets
   • We use the set covering approach, a common technique.
   • The first step is to learn an AND rule (here using ideas from boolean sensing).
   • Once we know one AND rule, remove all training samples which are identified by the rule.
   • Now repeat the learning on the remaining samples.
   • This leads to a DNF (an OR of AND clauses).

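The set-covering loop above can be sketched as follows; `learn_and_rule` is a hypothetical callback standing in for the LP-based clause learner:

```python
import numpy as np

def learn_rule_set(X, y, learn_and_rule, max_rules=10):
    """Set-covering sketch: repeatedly learn one AND rule, drop the
    positives it covers, and collect the rules into a DNF."""
    rules = []
    remaining = np.ones(len(y), dtype=bool)
    for _ in range(max_rules):
        if not np.any(y[remaining] == 1):
            break  # every positive sample is already covered
        rule = learn_and_rule(X[remaining], y[remaining])
        rules.append(rule)
        covered = np.array([rule(x) for x in X])
        # remove the positive samples the new rule identifies
        remaining &= ~(covered & (y == 1))
    return rules

def predict(rules, x):
    """A sample is positive if any learnt AND rule fires (DNF)."""
    return int(any(rule(x) for rule in rules))
```

With a dummy learner such as `lambda Xr, yr: (lambda x: x[0] > 0)`, one pass covers all positives of a toy dataset and the loop stops after a single rule.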
14. Evaluating the Approach
   • I have tried to evaluate the approach used in the paper on the Iris dataset.
   • The Iris dataset is a set of 150 tuples. There are four features: sepal width, sepal length, petal length, petal width. Each tuple describes an iris flower. There are three types of flowers: setosa, versicolor, virginica.

15. Three Rules Learnt
   • Petal length ≤ 5.4
   • Petal width ≤ 1.7
   • Petal width > 0.9

16. References
   • Malioutov, Dmitry M., and Kush R. Varshney. Exact Rule Learning via Boolean Compressed Sensing.
   • Malioutov, D., and Malyutov, M. Boolean compressed sensing: LP relaxation for group testing. In Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 3305–3308, Kyoto, Japan, Mar. 2012.