HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web

Talk about HypTrails at WWW 2016

Philipp Singer

April 20, 2016
Transcript

  1. GESIS - Leibniz Institute for the Social Sciences
    HypTrails: A Bayesian Approach for Comparing
    Hypotheses about Human Trails on the Web
    Philipp Singer, Denis Helic, Andreas Hotho
    and Markus Strohmaier
    www.philippsinger.info/hyptrails

  2. Vannevar Bush
    15.05.15 HypTrails - Philipp Singer
    image courtesy of brucesterling on Flickr
    Bush, V. (1945). As we may think. The Atlantic Monthly, 176(1):101–108.
    “[The human brain] operates by association.
    With one item in its grasp, it snaps instantly to the
    next that is suggested by the association of thoughts.”

  3. Human trails on the Web
    image courtesy of user Mmxx on Wikipedia

  4. Human trails on the Web
    image courtesy of user Mmxx on Wikipedia
    What are the mechanisms producing human trails on the Web?

  5. Example: Human navigational trails
    •  Humans prefer to navigate …
    –  H1: over semantically similar websites
    –  H2: via self-loops (e.g., refreshing)
    –  H3: by using the structural link network
    –  H4: by preferring similar categories
    –  H5: by utilizing structural properties
    –  H6: by information scent
    [West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

  6. Example: Human navigational trails
    •  Humans prefer to navigate …
    –  H1: over semantically similar websites
    –  H2: via self-loops (e.g., refreshing)
    –  H3: by using the structural link network
    –  H4: by preferring similar categories
    –  H5: by utilizing structural properties
    –  H6: by information scent
    [West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]
    What is the relative plausibility of these hypotheses given data?

  7. HypTrails in a nutshell
    •  Goal: Express and compare hypotheses about human trails in a coherent research approach
    •  Method:
    –  First-order Markov chain model
    –  Bayesian inference
    •  Idea:
    –  Incorporate hypotheses as priors
    –  Utilize sensitivity of the marginal likelihood to the prior
    •  Outcome: Partial ordering of hypotheses

  8. Markov chain model
    •  Stochastic model
    •  Transition probabilities between states
    \[
    \begin{pmatrix}
    p_{1,1} & p_{1,2} & \cdots & p_{1,j} \\
    p_{2,1} & p_{2,2} & \cdots & p_{2,j} \\
    \vdots & \vdots & \ddots & \vdots \\
    p_{i,1} & p_{i,2} & \cdots & p_{i,j}
    \end{pmatrix}
    \]
    [Diagram: states S1, S2, S3 with edge labels 1/2, 1/2, 1/3, 2/3, 1]
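As a rough illustration of the model on this slide, the sketch below estimates first-order transition probabilities by counting consecutive state pairs in a set of trails. The example trails and the function names `transition_counts` and `transition_probabilities` are hypothetical, chosen for illustration and not taken from the talk.

```python
from collections import Counter, defaultdict


def transition_counts(trails):
    """Count how often each state is directly followed by each other state."""
    counts = defaultdict(Counter)
    for trail in trails:
        # Pair every state with its successor in the same trail.
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1
    return counts


def transition_probabilities(counts):
    """Normalize each row of counts into maximum-likelihood probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical trails over three states (illustrative data only).
trails = [["S1", "S2", "S1", "S3"], ["S2", "S1", "S2"], ["S3", "S1"]]
counts = transition_counts(trails)
probs = transition_probabilities(counts)
```

Each row of `probs` sums to one, which is the defining property of a row in a Markov chain transition matrix.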

  9. Structure of HypTrails
    MC Model

  10. How to express hypotheses?

  11. Structural hypothesis
    [Diagram: edge labels 1/3, 1, 1/3, 1, 1/3]

  12. Uniform hypothesis
    [Diagram: edge label 1/3]

  13. Empirical observations
    [Diagram: edge labels 1.0, 2/3, 1/3, 1]

  14. Structure of HypTrails
    MC Model
    Hypothesis (H1)
    Belief in parameters

  15. Which hypothesis is the most plausible one?

  16. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis = Model evidence

  17. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis = Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis

  18. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis = Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis
    Probability of parameters before observing data

  19. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis = Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis
    Probability of parameters before observing data
    Hypothesis
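The formula these annotations label was an image and is not in the transcript. The labels match the standard marginal-likelihood integral, written out below. The symbols $D$ (data), $\theta$ (parameters), and $H$ (hypothesis) are assumed notation, not copied from the slide.

```latex
\underbrace{P(D \mid H)}_{\text{model evidence}}
  = \int
    \underbrace{P(D \mid \theta, H)}_{\text{likelihood}}\;
    \underbrace{P(\theta \mid H)}_{\text{prior}}\;
    d\theta
```

Integrating over $\theta$ is what "parameters are marginalized out" refers to; the hypothesis enters only through the prior $P(\theta \mid H)$.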

  20. Structure of HypTrails
    MC Model
    Hypothesis (H1)
    Belief in parameters
    Prior (H1)
    Elicitation
    Data (Trails)
    Marginal likelihood (H1)
    Influence
    Influence

  21. How to elicit priors from hypotheses?

  22. Eliciting priors
    •  (Trial) roulette method

  23. Eliciting priors
    •  (Trial) roulette method

  24. Eliciting priors
    •  (Trial) roulette method
    Prior distribution

  25. Conjugate Dirichlet prior
    •  Hyperparameters → pseudo counts
    MC parameters:
    \[
    \begin{pmatrix}
    p_{1,1} & p_{1,2} & \cdots & p_{1,j} \\
    p_{2,1} & p_{2,2} & \cdots & p_{2,j} \\
    \vdots & \vdots & \ddots & \vdots \\
    p_{i,1} & p_{i,2} & \cdots & p_{i,j}
    \end{pmatrix}
    \]
    Dirichlet hyperparameters:
    \[
    \begin{pmatrix}
    \alpha_{1,1} & \alpha_{1,2} & \cdots & \alpha_{1,j} \\
    \alpha_{2,1} & \alpha_{2,2} & \cdots & \alpha_{2,j} \\
    \vdots & \vdots & \ddots & \vdots \\
    \alpha_{i,1} & \alpha_{i,2} & \cdots & \alpha_{i,j}
    \end{pmatrix}
    \]
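Because the Dirichlet prior is conjugate to the categorical rows of a Markov chain, the marginal likelihood has a closed form in terms of Gamma functions. The sketch below evaluates it in log space, assuming one independent Dirichlet prior per row and strictly positive hyperparameters. The function name `log_evidence` and the example numbers are hypothetical and not taken from the talk or its reference implementation.

```python
import math


def log_evidence(counts, alphas):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors.

    counts[i][j]: observed transitions from state i to state j.
    alphas[i][j]: Dirichlet pseudo count for that transition (must be > 0).
    """
    total = 0.0
    for n_row, a_row in zip(counts, alphas):
        # Ratio of the normalizing constants of prior and posterior for this row.
        total += math.lgamma(sum(a_row)) - math.lgamma(sum(a_row) + sum(n_row))
        for n, a in zip(n_row, a_row):
            total += math.lgamma(n + a) - math.lgamma(a)
    return total


# Illustrative data: a single state with two possible successors.
observed = [[9, 1]]
matching_prior = [[9.0, 1.0]]     # pseudo counts that agree with the data
mismatched_prior = [[1.0, 9.0]]   # pseudo counts that contradict the data
better = log_evidence(observed, matching_prior)
worse = log_evidence(observed, mismatched_prior)
```

With these numbers the prior that agrees with the observations yields the larger log evidence, which is the sensitivity to the prior that the approach relies on.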

  26. Eliciting priors from hypotheses about human trails
    •  Adaptation of (trial) roulette method
    #Chips = k
    Strength of hypothesis
    k = 18

  27. Eliciting priors from hypotheses about human trails
    •  Adaptation of (trial) roulette method
    #Chips = k
    Strength of hypothesis
    k = 18
    → Dirichlet hyperparameters
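One way to picture the chip idea is the simplified sketch below. It spreads a fixed number of chips over the cells of a hypothesis matrix in proportion to the hypothesis values and adds them to a base pseudo count of 1. The largest-remainder rounding, the base count, the function name `elicit_dirichlet_prior`, and the example matrix are all assumptions made for illustration; the exact procedure in the talk and paper may differ.

```python
def elicit_dirichlet_prior(hypothesis, chips, base=1.0):
    """Turn a non-negative hypothesis matrix into Dirichlet pseudo counts.

    hypothesis: matrix of beliefs in each transition (any non-negative scale).
    chips: integer number of chips to distribute (strength of the hypothesis).
    base: pseudo count every cell starts with (assumed uniform starting prior).
    """
    cells = [(i, j, v) for i, row in enumerate(hypothesis) for j, v in enumerate(row)]
    total = sum(v for _, _, v in cells)
    if total <= 0:
        raise ValueError("hypothesis must contain at least one positive value")

    # Ideal (fractional) share of chips per cell, then the whole chips.
    shares = [chips * v / total for _, _, v in cells]
    placed = [int(s) for s in shares]

    # Hand out the remaining chips to the cells with the largest remainders.
    leftover = chips - sum(placed)
    order = sorted(range(len(cells)), key=lambda k: shares[k] - placed[k], reverse=True)
    for k in order[:leftover]:
        placed[k] += 1

    alphas = [[base] * len(row) for row in hypothesis]
    for (i, j, _), n in zip(cells, placed):
        alphas[i][j] += n
    return alphas


# Hypothetical structural hypothesis: 1 where a link exists, 0 otherwise.
structural = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
alphas = elicit_dirichlet_prior(structural, chips=18)
```

Cells the hypothesis rules out keep only the base count, while believed transitions receive more pseudo counts as the number of chips grows.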

  28. Example: Structural hypothesis
    [Diagram: edge labels 1/3, 1, 1/3, 1, 1/3]

  29. Example: Structural hypothesis
    [Figure: three panels with axes α, i, and nr. of chips]
    Input: Hypothesis
    Output: Dirichlet prior

  30. Example: Structural hypothesis
    [Figure: three panels with axes α, i, and nr. of chips]

  31. Example: Structural hypothesis
    [Figure: three panels with axes α, i, and nr. of chips]

  32. Example: Structural hypothesis
    [Figure: three panels with axes α, i, and nr. of chips]

  33. Structure of HypTrails
    MC Model
    Hypothesis (H1)
    Prior (H1)
    Data (Trails)
    Marginal likelihood (H1)
    Hypothesis (H2)
    Prior (H2)
    Marginal likelihood (H2)
    Compare
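The slide only labels this step "Compare". In Bayesian model comparison it is commonly expressed as a Bayes factor, the ratio of the two marginal likelihoods. The notation $B_{1,2}$, $D$, $H_1$, and $H_2$ below is assumed.

```latex
B_{1,2} = \frac{P(D \mid H_1)}{P(D \mid H_2)},
\qquad
\log B_{1,2} = \log P(D \mid H_1) - \log P(D \mid H_2)
```

A value of $\log B_{1,2}$ above zero favors $H_1$ and a value below zero favors $H_2$. Working with log evidences avoids numerical underflow when the data set is large.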

  34. Demonstration of general applicability
    •  Synthetic data
    •  Human song trails (Last.fm)
    •  Human review trails (Yelp)
    •  Human navigation trails (Wikigame)

  35. Wikigame
    [Plot: evidence (scale 1e8, ticks from −1.40 to −0.95) against hypothesis weighting factor k (0 to 4) for the uniform, self-loop, structural, and similarity hypotheses]
    Higher plausibility
    Higher belief (more chips)

  36. Wikigame
    [Plot: evidence (scale 1e8, ticks from −1.40 to −0.95) against hypothesis weighting factor k (0 to 4) for the uniform, self-loop, structural, and similarity hypotheses]

  37. Summary
    •  Studying mechanisms producing human trails
    •  HypTrails: A coherent approach for expressing and comparing hypotheses about human trails
    •  Can be applied to all kinds of human trails
    •  Implementations: www.philippsinger.info/hyptrails

  38. GESIS - Leibniz Institute for the Social Sciences
    THANKS for your attention!
    @ph_singer
    www.philippsinger.info
    www.philippsinger.info/hyptrails

  39. References 1/2
    •  [West et al. WWW 2015]
    –  Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference (WWW'15), Florence, Italy, 2015.
    •  [De Choudhury et al. HT 2010]
    –  Munmun De Choudhury, Moran Feldman, Sihem Amer-Yahia, Nadav Golbandi, Ronny Lempel, and Cong Yu: Automatic construction of travel itineraries using social breadcrumbs. 21st ACM conference on Hypertext and hypermedia, 2010.
    •  [Bestavros CIKM 1995]
    –  Azer Bestavros: Using speculation to reduce server load and service time on the WWW. 4th International conference on Information and knowledge management, 1995.
    •  [Perkowitz IJCAI 1997]
    –  Mike Perkowitz and Oren Etzioni: Adaptive web sites: an AI challenge. 15th international joint conference on Artificial intelligence, 1997.
    •  [West et al. IJCAI 2009]
    –  Robert West, Joelle Pineau, and Doina Precup: Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts. IJCAI, 2009.

  40. References 2/2
    •  [Singer et al. IJSWIS 2013]
    –  Philipp Singer, Thomas Niebler, Markus Strohmaier, and Andreas Hotho: Computing Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia. International Journal on Semantic Web and Information Systems (IJSWIS), vol. 9(4), 41-70, 2013.
    •  [West & Leskovec WWW 2012]
    –  Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st International World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.
    •  [Chi et al. CHI 2001]
    –  Ed H. Chi, et al.: Using information scent to model user information needs and actions and the Web. Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2001.

  41. Dirichlet distribution: Simplex