[Paper Introduction] Predicting Category Accesses for a User

ysekky
July 15, 2014


Transcript

  1. [Paper Introduction] Predicting Category Accesses for a User in a Structured Information Space
     Mao Chen, Andrea S. LaPaugh, Jaswinder Pal Singh (Princeton University)
     SIGIR 2002
     Yoshifumi Seki, 2014.07.15 @ Gunosy研究会 (Gunosy Research Meeting)
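  2. Overview
     • Predicts which categorized content a user will access
     • Cited by 38
     • Most papers deal with search behavior, but as apps and similar products simplify their UIs, methods from the category era may well become effective again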
  3. 1: Scoring categories by a user's current preferences
     • Infer a user's preference for a category from the user's accesses of that category
     • Two hypotheses
       – A user's access history reflects the user's current preference
       – A user's interest may shift over time
  4. Episode Formation
     • Episode
       – A set of consecutive accesses
       – Contains more than one "task"
     • The task structure is the same as the category structure
       – The category structure is designed to facilitate browsing
     • Static Approach
       – The episode includes all accesses
         • A user always has only one long-term goal
       – Partition the access history
     • Adaptive Approach
       – Defines the boundary between episodes
       – Based on the time interval between two consecutive accesses
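As a rough illustration of interval-based episode formation, the Python sketch below splits a time-ordered access log wherever the gap between two consecutive accesses exceeds a threshold. The function name `split_episodes`, the `(timestamp, category)` tuple layout, and the 1800-second timeout are assumptions made for this example; the slides do not specify how the threshold is chosen or adapted.

```python
from typing import List, Tuple

# Hypothetical record layout: (timestamp in seconds, category name).
Access = Tuple[float, str]


def split_episodes(accesses: List[Access], timeout: float) -> List[List[Access]]:
    """Split an access log into episodes at gaps longer than `timeout`."""
    episodes: List[List[Access]] = []
    for access in sorted(accesses, key=lambda a: a[0]):
        # Extend the current episode while the gap to its last access is small.
        if episodes and access[0] - episodes[-1][-1][0] <= timeout:
            episodes[-1].append(access)
        else:
            # First access, or a gap above the timeout: start a new episode.
            episodes.append([access])
    return episodes


log = [(0, "books"), (60, "books"), (120, "music"),
       (4000, "cameras"), (4030, "cameras")]
episodes = split_episodes(log, timeout=1800)  # 1800 s is an assumed value
```

With this log, the 3880-second gap before the first "cameras" access is the only one above the timeout, so the log splits into two episodes.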
  5. Episode Analysis
     • Existence analysis
       – All categories accessed in an episode are given an equal preference score, regardless of when and how often they are accessed
     • Frequency analysis
       – Counts the number of accesses to every category in an episode as the category's preference score
     • Recency analysis
       – Each access in an episode is given a weight according to the age of the access
     • Sequential analysis
       – A user repeats similar traversal paths from time to time. Under this assumption, a user's access after a sequence of accesses can be inferred from the old known paths that include the sequence
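The first three analyses can be sketched in Python as scoring functions over a single episode. This is a minimal illustration, not the paper's implementation: the function names and the `(timestamp, category)` layout are hypothetical, and the exponential half-life decay used for recency is an assumed weighting function, since the slides only say that the weight depends on the age of the access.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical record layout: (timestamp in seconds, category name).
Access = Tuple[float, str]


def existence_scores(episode: List[Access]) -> Dict[str, float]:
    """Every category seen in the episode gets the same score."""
    return {category: 1.0 for _, category in episode}


def frequency_scores(episode: List[Access]) -> Dict[str, int]:
    """A category's score is its number of accesses in the episode."""
    return dict(Counter(category for _, category in episode))


def recency_scores(episode: List[Access], now: float,
                   half_life: float) -> Dict[str, float]:
    """Sum age-dependent weights per category.

    Assumption: the weight halves every `half_life` seconds.
    """
    scores: Dict[str, float] = {}
    for timestamp, category in episode:
        age = now - timestamp
        scores[category] = scores.get(category, 0.0) + 0.5 ** (age / half_life)
    return scores


episode = [(0, "books"), (50, "music"), (100, "books")]
exist = existence_scores(episode)
freq = frequency_scores(episode)
recent = recency_scores(episode, now=100, half_life=50)
```

In this example "books" and "music" tie under existence analysis, "books" leads 2 to 1 under frequency analysis, and recency analysis widens the lead because the most recent access is to "books".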
  6. • Personal-Nth-Order-Markov (PN)
       – Static (Fixed) & Seq.
     • Last-N-Accesses (LN)
       – Static (Fixed) & Exis.
     • Fixed-Episode-Interval (FEI)
       – The boundary between episodes is set by a fixed interval, an approach popularly used to determine sessions from server logs
       – Each category is scored by its access frequency during the last episode
       – Adp (Fixed) & Freq
  7. • Whole History (WH)
       – Static (All) & Freq
     • Past Days (PD)
       – Static (Fixed) & Freq
     • Time-Weighted (TW)
       – Every access is weighted by a function of the access age
       – Static (All) & Recen
     • Adaptive-Episode-Interval (AEI)
       – The boundary between episodes is set using "Adaptive-Time-Out"
       – Adp (Adp) & Freq
     • Collaborative-Nth-Order-Markov (CN)
       – The transition matrix is built from the traversals of the whole user community
       – Static (Fixed size) & Seq.
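To make the difference between the personal and collaborative Markov variants concrete, the Python sketch below builds Nth-order transition counts from category paths and predicts the most frequent follower of the last N accesses. Under this reading, the personal variant is trained on one user's path and the collaborative variant on the pooled paths of all users. The names `build_transitions` and `predict_next` and the sample paths are hypothetical, and a plain count table stands in for whatever transition-matrix representation the paper uses.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Transitions = Dict[Tuple[str, ...], Counter]


def build_transitions(paths: Sequence[Sequence[str]], order: int) -> Transitions:
    """Count which category follows each run of `order` consecutive categories."""
    transitions: Transitions = defaultdict(Counter)
    for path in paths:
        for i in range(len(path) - order):
            context = tuple(path[i:i + order])
            transitions[context][path[i + order]] += 1
    return transitions


def predict_next(transitions: Transitions, recent: Sequence[str],
                 order: int) -> Optional[str]:
    """Return the most frequent follower of the last `order` accesses, if any."""
    followers = transitions.get(tuple(recent[-order:]))
    if not followers:
        return None  # this context was never observed
    return followers.most_common(1)[0][0]


user_path = ["books", "music", "books", "music", "books", "cameras"]
other_paths = [["books", "cameras"]] * 3  # made-up community data

personal = build_transitions([user_path], order=1)
collaborative = build_transitions([user_path] + other_paths, order=1)

personal_guess = predict_next(personal, ["books"], order=1)
community_guess = predict_next(collaborative, ["books"], order=1)
```

Here the personal model predicts "music" after "books", while the pooled counts favor "cameras". The personal model also returns `None` for any context the user has never produced.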
  8. 2: Generating predictions
     • A hierarchical structure has two kinds of topology information
       – Layer
       – Granularity
  9. Experiment
     • Epinions
       – Online shopping site
     • Summer of 2000
     • 256,899 ratings
       – 38,010 users
       – 53,756 items
  10. • Adaptive-Episode-Interval and Time-Weighted have better prediction quality
      • Time-Weighted has precision similar to Adaptive-Episode-Interval, but it is superior to the latter in coverage at the fine-grained category level
      • The Markov model that uses only personal navigation information has poor precision and coverage. The one that uses all users' accesses has much better quality, but is still less accurate than the two methods above