Comparing Hypotheses about Human Trails on the Web

Invited talk at the Artificial Intelligence Seminars at the Information Sciences Institute USC

Philipp Singer

April 20, 2016

Transcript

  1. GESIS - Leibniz Institute for the Social Sciences
    Philipp Singer, Denis Helic, Andreas Hotho and Markus Strohmaier
    Comparing Hypotheses about Human Trails on the Web
    HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web, WWW 2015
  2. Vannevar Bush
    “[The human brain] operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts.”
    Bush, V. (1945). As we may think. The Atlantic Monthly, 176(1):101–108.
    Image courtesy of brucesterling on Flickr.
    16.09.2015 HypTrails - Philipp Singer
  3. Human trails on the Web
    Image courtesy of user Mmxx on Wikipedia.
  4. Human trails on the Web
    Image courtesy of user Mmxx on Wikipedia.
    What are the mechanisms producing human trails on the Web?
  5. Example: Human navigational trails
    • Humans prefer to navigate …
      – H1: over semantically similar websites
      – H2: via self-loops (e.g., refreshing)
      – H3: by using the structural link network
      – H4: by preferring similar categories
      – H5: by utilizing structural properties
      – H6: by information scent
    [West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]
  6. Example: Human navigational trails
    What is the relative plausibility of these hypotheses given data?
  7. HypTrails in a nutshell
    • Goal: Express and compare hypotheses about human trails in a coherent research approach
    • Method:
      – First-order Markov chain model
      – Bayesian inference
    • Idea:
      – Incorporate hypotheses as priors
      – Utilize the sensitivity of the marginal likelihood to the prior
    • Outcome: Partial ordering of hypotheses
  8. Structure of HypTrails
  9. Markov chain model
    • Stochastic model
    • Transition probabilities between states
    [Figure: three-state chain (S1, S2, S3) with transition probabilities 1/2, 1/2, 1/3, 2/3 and 1]
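
A minimal sketch of the first-order Markov chain idea on slide 9: count transitions between consecutive states in a set of trails, then normalize each row into transition probabilities. The trails and the function names below are hypothetical and not taken from the talk.

```python
from collections import Counter, defaultdict


def transition_counts(trails):
    """Count first-order transitions (state -> next state) over all trails."""
    counts = defaultdict(Counter)
    for trail in trails:
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1
    return counts


def transition_probabilities(counts):
    """Row-normalize counts into maximum-likelihood transition probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical trails over three states named as on the slide.
trails = [["S1", "S2", "S3"], ["S1", "S3", "S2"], ["S2", "S1", "S2", "S3"]]
counts = transition_counts(trails)
probs = transition_probabilities(counts)

for src in sorted(probs):
    print(src, probs[src])
```

Each row of `probs` sums to one, which is the property the transition labels in the slide figure express.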
  10. Structure of HypTrails
    [Diagram: MC Model, shown as the three-state chain (S1, S2, S3) with transition probabilities 1/2, 1/2, 1/3, 2/3 and 1]
  11. How to express hypotheses?
  12. Structural hypothesis
    [Figure: link network with transition beliefs 1/3, 1, 1/3, 1 and 1/3]
  13. Uniform hypothesis
    [Figure: transition beliefs of 1/3]
  14. Structure of HypTrails
    [Diagram labels: MC Model; Hypothesis (H1); Belief in parameters. Figure: hypothesis matrix and chip counts over states h1, h2, h3]
  15. Empirical observations
    [Figure: observed transition probabilities 1.0, 2/3, 1/3 and 1]
  16. Which hypothesis is the most plausible one?
  17. Bayesian inference
    Posterior = (Likelihood ∗ Prior) / Marginal likelihood
  18. Bayesian model comparison: Marginal likelihood
    [Equation with labels: Probability of data given hypothesis = Model evidence]
  19. Bayesian model comparison: Marginal likelihood
    [Equation with labels:]
    • Probability of data given hypothesis (model evidence)
    • Parameters are marginalized out
    • Probability of observing data given parameters and hypothesis
  20. Bayesian model comparison: Marginal likelihood
    [Equation with labels:]
    • Probability of data given hypothesis (model evidence)
    • Parameters are marginalized out
    • Probability of observing data given parameters and hypothesis
    • Probability of parameters before observing data
  21. Bayesian model comparison: Marginal likelihood
    [Equation with labels:]
    • Probability of data given hypothesis (model evidence)
    • Parameters are marginalized out
    • Probability of observing data given parameters and hypothesis
    • Probability of parameters before observing data
    • Hypothesis
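
The equation annotated on slides 18 to 21 did not survive in the transcript. Going by its labels, it is presumably the standard marginal likelihood integral shown below. The symbols D (data), θ (Markov chain parameters) and H (hypothesis) are notation assumed here, not taken from the slides.

```latex
\underbrace{P(D \mid H)}_{\text{model evidence}}
  = \int \underbrace{P(D \mid \theta, H)}_{\text{data given parameters and hypothesis}}
         \, \underbrace{P(\theta \mid H)}_{\text{prior over parameters}} \, d\theta
```

Integrating over θ is what "parameters are marginalized out" refers to: the evidence depends only on the data and on the prior that encodes the hypothesis.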
  22. Structure of HypTrails
    [Diagram labels: MC Model; Hypothesis (H1); Belief in parameters; Elicitation; Prior (H1); Data (Trails); Influence; Marginal likelihood (H1)]
  23. How to elicit priors from hypotheses?
  24. Eliciting priors
    • (Trial) roulette method
  25. Eliciting priors
    • (Trial) roulette method
  26. Eliciting priors
    • (Trial) roulette method
    [Figure label: Prior distribution]
  27. Conjugate Dirichlet prior
    • Hyperparameters → pseudo counts
    [Figure labels: MC parameters, Dirichlet hyperparameters]
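
A small sketch of why Dirichlet hyperparameters can be read as pseudo counts: for one row of the transition matrix, a Dirichlet prior combined with observed transition counts gives a Dirichlet posterior whose hyperparameters are simply prior plus counts. The numbers and function names are hypothetical.

```python
def posterior_hyperparameters(alpha_row, count_row):
    """Conjugate update: Dirichlet prior + transition counts -> Dirichlet posterior."""
    return [a + n for a, n in zip(alpha_row, count_row)]


def posterior_mean(alpha_row, count_row):
    """Expected transition probabilities under the posterior."""
    post = posterior_hyperparameters(alpha_row, count_row)
    total = sum(post)
    return [p / total for p in post]


# Hypothetical prior for one source state: strong belief in the second target.
alpha_row = [1.0, 4.0, 1.0]
# Hypothetical observed transitions from that source state.
count_row = [0, 2, 1]

post = posterior_hyperparameters(alpha_row, count_row)
mean = posterior_mean(alpha_row, count_row)
print(post, mean)
```

The prior behaves as if the listed number of transitions had already been seen before any data arrived, which is why larger hyperparameters express stronger belief.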
  28. Eliciting priors from hypotheses about human trails
    • Adaptation of the (trial) roulette method
    [Figure labels: #Chips = k; Strength of hypothesis; k = 18]
  29. Eliciting priors from hypotheses about human trails
    • Adaptation of the (trial) roulette method
    [Figure labels: #Chips = k; Strength of hypothesis; k = 18 → Dirichlet hyperparameters]
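
One way the chip idea could be sketched in code. This is an illustration under assumptions, not the authors' exact procedure: chips are split across the cells of one row in proportion to the hypothesis beliefs, leftover chips go to the cells with the largest rounding error, and a uniform pseudo count of 1 is added so that every hyperparameter stays positive. The function names and the belief values are hypothetical.

```python
import math


def distribute_chips(beliefs, chips):
    """Split an integer number of chips across cells in proportion to beliefs."""
    total = sum(beliefs)
    if total == 0:
        return [0] * len(beliefs)
    shares = [chips * b / total for b in beliefs]
    placed = [math.floor(s) for s in shares]
    leftover = chips - sum(placed)
    # Hand the remaining chips to the cells with the largest rounding error.
    order = sorted(range(len(beliefs)),
                   key=lambda j: shares[j] - placed[j], reverse=True)
    for j in order[:leftover]:
        placed[j] += 1
    return placed


def elicit_dirichlet_row(beliefs, chips, proto_prior=1.0):
    """Assumed convention: hyperparameter = uniform pseudo count + chips."""
    return [proto_prior + c for c in distribute_chips(beliefs, chips)]


# Hypothetical (unnormalized) beliefs for one row, with k = 18 chips.
beliefs = [5, 3, 2]
row = elicit_dirichlet_row(beliefs, 18)
print(row)
```

A larger chip count concentrates the prior more tightly around the hypothesis, which matches the slide's reading of k as the strength of the hypothesis.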
  30. Example: Structural hypothesis
    [Figure: link network with transition beliefs 1/3, 1, 1/3, 1 and 1/3]
  31. Example: Structural hypothesis
    [Figure: three heatmap panels over states h1, h2, h3, with axes α_i and α_j and a color scale "nr. of chips" from 1 to 3. Values are listed row by row, rows h3, h2, h1.]
    Panel 1: 0.00 0.33 0.00 | 1.00 0.33 1.00 | 0.00 0.33 0.00
    Panel 2: 0.00 0.99 0.00 | 3.01 0.99 3.01 | 0.00 0.99 0.00
    Panel 3: 0.00 0.99 0.00 | 0.01 0.99 0.01 | 0.00 0.99 0.00
    Input: Hypothesis. Output: Dirichlet prior.
  32. Example: Structural hypothesis
    [Figure: same three panels as on slide 31]
  33. Example: Structural hypothesis
    [Figure: same three panels as on slide 31]
  34. Example: Structural hypothesis
    [Figure: same three panels as on slide 31]
  35. Structure of HypTrails
    [Diagram labels: MC Model; Belief in parameters; Hypothesis (H1); Elicitation; Dirichlet Prior (H1); Hypothesis (H2); Dirichlet Prior (H2); Data (Trails); Influence; Marginal likelihood (H1); Marginal likelihood (H2); Compare]
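
A sketch of the comparison step, assuming the standard closed form of the marginal likelihood for a Markov chain with an independent Dirichlet prior on each row: per row, Γ(Σα)/ΠΓ(α) · ΠΓ(n+α)/Γ(Σ(n+α)). The counts, priors and function name are hypothetical; only `math.lgamma` from the Python standard library is used.

```python
from math import lgamma, log


def log_evidence(counts, alphas):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors.

    counts[i][j] is the number of observed transitions i -> j and
    alphas[i][j] the matching Dirichlet hyperparameter (must be > 0).
    """
    total = 0.0
    for n_row, a_row in zip(counts, alphas):
        total += lgamma(sum(a_row)) - sum(lgamma(a) for a in a_row)
        total += sum(lgamma(n + a) for n, a in zip(n_row, a_row))
        total -= lgamma(sum(n_row) + sum(a_row))
    return total


# Hypothetical transition counts over three states.
counts = [[0, 8, 2], [5, 0, 5], [1, 9, 0]]

# Two hypothetical priors: a flat one and one that roughly matches the counts.
uniform = [[1.0, 1.0, 1.0] for _ in range(3)]
informed = [[1.0, 5.0, 2.0], [3.0, 1.0, 3.0], [1.0, 5.0, 1.0]]

# Log Bayes factor: positive values favor the informed hypothesis.
log_bf = log_evidence(counts, informed) - log_evidence(counts, uniform)
print(log_bf)
```

Working in log space avoids underflow, since evidence values for realistic trail data are extremely small.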
  36. Examples
  37. Wikigame – Navigation Trails
    [Plot: evidence (scale 1e8, −1.40 to −0.95) against hypothesis weighting factor k (0 to 4) for the hypotheses uniform, self-loop, structural and similarity. Annotations: Higher plausibility, Higher belief.]
  38. Wikigame – Navigation Trails
    [Plot: evidence (scale 1e8, −1.40 to −0.95) against hypothesis weighting factor k (0 to 4) for the hypotheses uniform, self-loop, structural and similarity]
  39. Last.fm – Song Trails
    [Plot: evidence (scale 1e5, −1.55 to −1.10) against hypothesis weighting factor k (0 to 4) for the hypotheses uniform, self-loop, track date and similarity]
  40. Yelp – Review Trails
    [Plot: evidence (scale 1e7, −1.30 to −1.12) against hypothesis weighting factor k (0 to 4) for the hypotheses uniform, self-loop, geographic and similarity]
  41. Empirical studies
  42. Flickr - Photo Trails
    Photowalking the city: Comparing hypotheses about urban photo trails on Flickr. Martin Becker, Philipp Singer, Florian Lemmerich, Andreas Hotho, Denis Helic and Markus Strohmaier; under review
  43. Flickr - Photo Trails
  44. Ontology Engineering – Edit Trails
    Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World-Projects. Simon Walk, Philipp Singer, Lisette Espín Noboa, Tania Tudorache, Mark A. Musen and Markus Strohmaier
  45. Summary
    • Studying mechanisms producing human trails
    • HypTrails: A coherent approach for expressing and comparing hypotheses about human trails
    • Can be applied to all kinds of human trails
    • Tutorial: www.philippsinger.info/hyptrails
  46. Events in Cologne
    • CSS Winter Symposium – Dec 1-3
    • ICWSM 2016 – May 17-20
    Image courtesy of user Jiuguang Wang on Wikipedia.
  47. Events in Cologne
    • CSS Winter Symposium – Dec 1-3
    • ICWSM 2016 – May 17-20
    Images courtesy of user Jiuguang Wang on Wikipedia and of “Privatbrauerei Gaffel Becker & Co”, also taken from Wikipedia.
  48. THANKS for your attention!
    GESIS - Leibniz Institute for the Social Sciences
    @ph_singer
    www.philippsinger.info
    www.philippsinger.info/hyptrails
  49. References 1/2
    • [West et al. WWW 2015] Robert West, Ashwin Paranjape and Jure Leskovec: Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference (WWW'15), Florence, Italy, 2015.
    • [De Choudhury et al. HT 2010] Munmun De Choudhury, Moran Feldman, Sihem Amer-Yahia, Nadav Golbandi, Ronny Lempel and Cong Yu: Automatic construction of travel itineraries using social breadcrumbs. 21st ACM Conference on Hypertext and Hypermedia, 2010.
    • [Bestavros CIKM 1995] Azer Bestavros: Using speculation to reduce server load and service time on the WWW. 4th International Conference on Information and Knowledge Management, 1995.
    • [Perkowitz IJCAI 1997] Mike Perkowitz and Oren Etzioni: Adaptive web sites: an AI challenge. 15th International Joint Conference on Artificial Intelligence, 1997.
    • [West et al. IJCAI 2009] Robert West, Joelle Pineau and Doina Precup: Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts. IJCAI, 2009.
  50. References 2/2
    • [Singer et al. IJSWIS 2013] Philipp Singer, Thomas Niebler, Markus Strohmaier and Andreas Hotho: Computing Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia. International Journal on Semantic Web and Information Systems (IJSWIS), 9(4), 41-70, 2013.
    • [West & Leskovec WWW 2012] Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st International World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.
    • [Chi et al. CHI 2001] Ed H. Chi et al.: Using information scent to model user information needs and actions on the Web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2001.
  51. Dirichlet distribution: Simplex
  52. Utilization of Bayesian inference and marginal likelihoods
    [Equation with labels:]
    • Probability of data given hypothesis (model evidence)
    • Parameters are marginalized out