Comparing Hypotheses about Human Trails on the Web

Invited talk at the Artificial Intelligence Seminars at the Information Sciences Institute USC

Philipp Singer

April 20, 2016

Transcript

  1. GESIS - Leibniz Institute for the Social Sciences
    Philipp Singer, Denis Helic, Andreas Hotho
    and Markus Strohmaier
    Comparing Hypotheses about
    Human Trails on the Web
    HypTrails: A Bayesian Approach for Comparing
    Hypotheses about Human Trails on the Web
    WWW 2015

  2. Vannevar Bush
    16.09.2015 HypTrails - Philipp Singer
    image courtesy of brucesterling on Flickr
    Bush, V. (1945). As we may think. The Atlantic Monthly, 176(1):101–108.
    “[The human brain] operates by association.
    With one item in its grasp, it snaps instantly to the
    next that is suggested by the association of thoughts.”

  3. Human trails on the Web
    image courtesy of user Mmxx on Wikipedia

  4. Human trails on the Web
    image courtesy of user Mmxx on Wikipedia
    What are the mechanisms producing human trails on the Web?

  5. Example: Human navigational trails
    • Humans prefer to navigate …
    – H1: over semantically similar websites
    – H2: via self-loops (e.g., refreshing)
    – H3: by using the structural link network
    – H4: by preferring similar categories
    – H5: by utilizing structural properties
    – H6: by information scent
    [West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

  6. Example: Human navigational trails
    What is the relative plausibility of these hypotheses given data?

  7. HypTrails in a nutshell
    • Goal: Express and compare hypotheses about human trails
    in a coherent research approach
    • Method:
    – First-order Markov chain model
    – Bayesian inference
    • Idea:
    – Incorporate hypotheses as priors
    – Utilize the sensitivity of the marginal likelihood to the prior
    • Outcome: Partial ordering of hypotheses

  8. Structure of HypTrails

  9. Markov chain model
    • Stochastic model
    • Transition probabilities between states
    [Diagram: three states S1, S2, S3 with transition probabilities 1/2, 1/2, 1/3, 2/3 and 1]
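
A first-order Markov chain of this kind can be sketched as a row-stochastic transition matrix. The sketch below is illustrative only: the transcript lists just the values 1/2, 1/2, 1/3, 2/3 and 1, so their placement on specific edges is an assumption, and the names `P` and `trail_log_likelihood` are hypothetical.

```python
import math

# Assumed transition matrix for states S1, S2, S3 (row = current state, column = next state).
# Only the values 1/2, 1/2, 1/3, 2/3 and 1 come from the slide; their placement is a guess.
P = [
    [0.0, 1/2, 1/2],   # S1 -> S2 or S3 with equal probability
    [1/3, 0.0, 2/3],   # S2 -> S1 or S3
    [1.0, 0.0, 0.0],   # S3 -> S1
]

def trail_log_likelihood(trail, P):
    """Log-probability of a trail (list of state indices) under a first-order chain."""
    logp = 0.0
    for src, dst in zip(trail, trail[1:]):
        p = P[src][dst]
        if p == 0.0:
            return float("-inf")  # the trail uses a transition the model forbids
        logp += math.log(p)
    return logp

# Each row must sum to 1 for P to be a valid transition matrix.
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)

# Trail S1 -> S2 -> S3 -> S1 has probability 1/2 * 2/3 * 1 = 1/3.
print(math.exp(trail_log_likelihood([0, 1, 2, 0], P)))
```

Working in log space keeps long trails from underflowing, which matters once trails contain thousands of transitions.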

  10. Structure of HypTrails
    [Diagram: MC Model box containing a three-state Markov chain (S1, S2, S3; transition probabilities 1/2, 1/2, 1/3, 2/3, 1)]

  11. How to express hypotheses?

  12. Structural hypothesis
    [Diagram: state network annotated with transition probabilities 1/3, 1, 1/3, 1, 1/3]

  13. Uniform hypothesis
    [Diagram: state network annotated with transition probability 1/3]

  14. Structure of HypTrails
    MC Model
    Hypothesis (H1)
    Belief in parameters
    [Figure: two 3×3 matrices, rows h3, h2, h1 (top to bottom), color scale "nr. of chips"]
    Matrix 1:
    h3  0.00  0.33  0.00
    h2  1.00  0.33  1.00
    h1  0.00  0.33  0.00
    Matrix 2:
    h3  0.00  0.99  0.00
    h2  3.01  0.99  3.01
    h1  0.00  0.99  0.00

  15. Empirical observations
    [Diagram: state network annotated with the values 1.0, 2/3, 1/3, 1]

  16. Which hypothesis is the most plausible one?

  17. Bayesian inference
    Posterior = (Likelihood ∗ Prior) / Marginal likelihood

  18. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis = Model evidence

  19. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis
    Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis

  20. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis
    Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis
    Probability of parameters before observing data

  21. Bayesian model comparison: Marginal likelihood
    Probability of data given hypothesis
    Model evidence
    Parameters are marginalized out
    Probability of observing data given parameters and hypothesis
    Probability of parameters before observing data
    Hypothesis
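
The labels on this slide describe the standard marginal likelihood integral. The notation below is an assumption, since the formula itself is not in the transcript: $D$ stands for the data (trails), $\theta$ for the Markov chain parameters, and $H$ for the hypothesis.

```latex
% Model evidence: the parameters \theta are marginalized out.
P(D \mid H) = \int \underbrace{P(D \mid \theta, H)}_{\text{likelihood}} \;
                   \underbrace{P(\theta \mid H)}_{\text{prior}} \, d\theta

% The same quantity is the normalizing constant in Bayes' rule:
P(\theta \mid D, H) = \frac{P(D \mid \theta, H)\, P(\theta \mid H)}{P(D \mid H)}
```

Because the prior $P(\theta \mid H)$ is where the hypothesis enters, two hypotheses evaluated on the same data differ only through this term.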

  22. Structure of HypTrails
    [Diagram labels: MC Model; Hypothesis (H1); Belief in parameters; Elicitation; Prior (H1); Data (Trails); Influence (×2); Marginal likelihood (H1)]

  23. How to elicit priors from hypotheses?

  24. Eliciting priors
    • (Trial) roulette method

  25. Eliciting priors
    • (Trial) roulette method

  26. Eliciting priors
    • (Trial) roulette method
    Prior distribution

  27. Conjugate Dirichlet prior
    • Hyperparameters → pseudo counts
    [Formula labels: MC parameters; Dirichlet hyperparameters]
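
The pseudo-count reading of Dirichlet hyperparameters can be illustrated with the conjugate update for the transitions out of a single state: the posterior hyperparameters are the prior hyperparameters plus the observed transition counts. This is a minimal sketch with made-up numbers; the function names are hypothetical.

```python
def posterior_hyperparameters(alpha, counts):
    """Conjugate update for one state: Dirichlet(alpha) prior + transition counts."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + n for a, n in zip(alpha, counts)]

def posterior_mean(alpha):
    """Expected transition probabilities under Dirichlet(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Illustrative numbers (not from the talk): prior pseudo counts for transitions
# out of one state, and observed transition counts from the same state.
alpha_prior = [1.0, 4.0, 1.0]
observed = [2, 10, 0]

alpha_post = posterior_hyperparameters(alpha_prior, observed)
print(alpha_post)                  # [3.0, 14.0, 1.0]
print(posterior_mean(alpha_post))
```

Larger prior pseudo counts pull the posterior mean more strongly toward the prior, which is why they can express how strongly a hypothesis is believed.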

  28. Eliciting priors from hypotheses about human trails
    • Adaptation of (trial) roulette method
    #Chips = k
    Strength of hypothesis
    k = 18

  29. Eliciting priors from hypotheses about human trails
    • Adaptation of (trial) roulette method
    #Chips = k
    Strength of hypothesis
    k = 18
    → Dirichlet hyperparameters
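
One way to turn a hypothesis into Dirichlet hyperparameters in the spirit of the roulette method is sketched below. Several details are assumptions rather than statements from the slides: chips are spread over the whole matrix in proportion to the hypothesized beliefs, leftover chips go to the cells with the largest remainders, and a uniform pseudo count of 1 (`proto_prior`) is added to every cell. The function name and the row-per-source-state layout are hypothetical.

```python
from fractions import Fraction
import math

def elicit_dirichlet_prior(hypothesis, chips, proto_prior=1):
    """Turn a hypothesis matrix into Dirichlet hyperparameters (pseudo counts).

    hypothesis: square matrix of non-negative beliefs in each transition.
    chips: total number of chips to spread over the whole matrix.
    proto_prior: assumed uniform pseudo count added to every cell.
    """
    cells = [(i, j) for i, row in enumerate(hypothesis) for j, _ in enumerate(row)]
    total = sum(Fraction(hypothesis[i][j]) for i, j in cells)
    if total == 0:
        raise ValueError("hypothesis must contain at least one positive belief")

    # Ideal (possibly fractional) share of chips per cell, kept exact with Fraction.
    share = {(i, j): Fraction(hypothesis[i][j]) / total * chips for i, j in cells}
    # Hand out the integer part first ...
    placed = {c: math.floor(s) for c, s in share.items()}
    # ... then give the leftover chips to the cells with the largest remainders.
    leftover = chips - sum(placed.values())
    by_remainder = sorted(cells, key=lambda c: share[c] - placed[c], reverse=True)
    for c in by_remainder[:leftover]:
        placed[c] += 1

    return [[placed[(i, j)] + proto_prior for j in range(len(row))]
            for i, row in enumerate(hypothesis)]

# Structural hypothesis on three states (assumed layout: row = source state).
third = Fraction(1, 3)
structural = [
    [0, 1, 0],
    [third, third, third],
    [0, 1, 0],
]
alpha = elicit_dirichlet_prior(structural, chips=9)
print(alpha)  # [[1, 4, 1], [2, 2, 2], [1, 4, 1]]
```

Using `Fraction` avoids floating-point rounding when shares such as 1/3 are multiplied by the chip count, so the integer chip assignment is deterministic.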

  30. Example: Structural hypothesis
    [Diagram: state network annotated with transition probabilities 1/3, 1, 1/3, 1, 1/3]

  31. Example: Structural hypothesis
    Input: Hypothesis
    Output: Dirichlet prior
    [Figure: three 3×3 matrices, rows h3, h2, h1 (top to bottom), columns 1 to 3, color scale "nr. of chips" (1 to 3)]
    Matrix 1:
    h3  0.00  0.33  0.00
    h2  1.00  0.33  1.00
    h1  0.00  0.33  0.00
    Matrix 2:
    h3  0.00  0.99  0.00
    h2  3.01  0.99  3.01
    h1  0.00  0.99  0.00
    Matrix 3:
    h3  0.00  0.99  0.00
    h2  0.01  0.99  0.01
    h1  0.00  0.99  0.00

  32. Example: Structural hypothesis
    [Figure: the same three matrices as on slide 31]

  33. Example: Structural hypothesis
    [Figure: the same three matrices as on slide 31]

  34. Example: Structural hypothesis
    [Figure: the same three matrices as on slide 31]

  35. Structure of HypTrails
    [Diagram labels: MC Model; Hypothesis (H1); Hypothesis (H2); Belief in parameters; Elicitation; Dirichlet Prior (H1); Dirichlet Prior (H2); Data (Trails); Influence (×2); Marginal likelihood (H1); Marginal likelihood (H2); Compare]
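
The comparison step can be sketched with the closed-form marginal likelihood of a Markov chain under row-wise Dirichlet priors (the standard Dirichlet-multinomial result for a fixed sequence of transitions). The counts and both priors below are made-up illustrations, not data from the talk, and `log_evidence` is a hypothetical helper name.

```python
import math

def log_evidence(counts, alpha):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors."""
    total = 0.0
    for n_row, a_row in zip(counts, alpha):
        # Normalizing constant of the prior for this row.
        total += math.lgamma(sum(a_row)) - sum(math.lgamma(a) for a in a_row)
        # Normalizing constant of the posterior for this row (inverted).
        total += sum(math.lgamma(n + a) for n, a in zip(n_row, a_row))
        total -= math.lgamma(sum(n_row) + sum(a_row))
    return total

# Illustrative transition counts n[i][j] (made up).
counts = [
    [0, 9, 1],
    [4, 3, 5],
    [1, 8, 0],
]

# Two priors with the same total number of pseudo counts, so they are comparable.
alpha_structural = [[1, 4, 1], [2, 2, 2], [1, 4, 1]]
alpha_uniform = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]

ev_structural = log_evidence(counts, alpha_structural)
ev_uniform = log_evidence(counts, alpha_uniform)

# The log Bayes factor is the difference of log evidences; positive favours the first.
log_bayes_factor = ev_structural - ev_uniform
print(ev_structural, ev_uniform, log_bayes_factor)
```

Repeating this for every hypothesis and ranking by log evidence yields an ordering of hypotheses for a given prior strength.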

  36. Examples

  37. Wikigame – Navigation Trails
    [Plot: evidence (y-axis, scale 1e8, ticks from −1.40 to −0.95) against hypothesis weighting factor k (x-axis, 0 to 4); hypotheses: uniform, self-loop, structural, similarity; annotations: "Higher plausibility", "Higher belief"]

  38. Wikigame – Navigation Trails
    [Plot: evidence (y-axis, scale 1e8, ticks from −1.40 to −0.95) against hypothesis weighting factor k (x-axis, 0 to 4); hypotheses: uniform, self-loop, structural, similarity]

  39. Last.fm – Song Trails
    [Plot: evidence (y-axis, scale 1e5, ticks from −1.55 to −1.10) against hypothesis weighting factor k (x-axis, 0 to 4); hypotheses: uniform, self-loop, track date, similarity]

  40. Yelp – Review Trails
    [Plot: evidence (y-axis, scale 1e7, ticks from −1.30 to −1.12) against hypothesis weighting factor k (x-axis, 0 to 4); hypotheses: uniform, self-loop, geographic, similarity]

  41. Empirical studies

  42. Flickr - Photo Trails
    Photowalking the city: Comparing hypotheses about urban photo trails on Flickr
    Martin Becker, Philipp Singer, Florian Lemmerich, Andreas Hotho,
    Denis Helic and Markus Strohmaier; under review

  43. Flickr - Photo Trails

  44. Ontology Engineering – Edit Trails
    Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World-Projects
    Simon Walk, Philipp Singer, Lisette Espín Noboa, Tania Tudorache, Mark A. Musen
    and Markus Strohmaier

  45. Summary
    • Studying mechanisms producing human trails
    • HypTrails: A coherent approach for expressing and
    comparing hypotheses about human trails
    • Can be applied to all kinds of human trails
    • Tutorial: www.philippsinger.info/hyptrails

  46. Events in Cologne
    • CSS Winter Symposium
    – Dec 1-3
    • ICWSM 2016
    – May 17-20
    image courtesy of user Jiuguang Wang on Wikipedia

  47. Events in Cologne
    images courtesy of user Jiuguang Wang on Wikipedia and of “Privatbrauerei Gaffel Becker & Co”, also via Wikipedia

  48. GESIS - Leibniz Institute for the Social Sciences
    THANKS for your attention!
    @ph_singer
    www.philippsinger.info
    www.philippsinger.info/hyptrails

  49. References 1/2
    • [West et al. WWW 2015]
    – Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from Human
    Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference
    (WWW'15), Florence, Italy, 2015.
    • [De Choudhury et al. HT 2010]
    – Munmun De Choudhury, Moran Feldman, Sihem Amer-Yahia, Nadav Golbandi, Ronny Lempel, and
    Cong Yu: Automatic Construction of Travel Itineraries Using Social Breadcrumbs.
    21st ACM Conference on Hypertext and Hypermedia, 2010.
    • [Bestavros CIKM 1995]
    – Azer Bestavros: Using Speculation to Reduce Server Load and Service Time on the WWW.
    4th International Conference on Information and Knowledge Management, 1995.
    • [Perkowitz IJCAI 1997]
    – Mike Perkowitz and Oren Etzioni: Adaptive Web Sites: An AI Challenge. 15th International Joint
    Conference on Artificial Intelligence, 1997.
    • [West et al. IJCAI 2009]
    – Robert West, Joelle Pineau, and Doina Precup: Wikispeedia: An Online Game for Inferring Semantic
    Distances between Concepts. IJCAI, 2009.

  50. References 2/2
    • [Singer et al. IJSWIS 2013]
    – Philipp Singer, Thomas Niebler, Markus Strohmaier, and Andreas Hotho: Computing Semantic
    Relatedness from Human Navigational Paths: A Case Study on Wikipedia. International Journal on
    Semantic Web and Information Systems (IJSWIS), vol. 9(4), pp. 41–70, 2013.
    • [West & Leskovec WWW 2012]
    – Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st International
    World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.
    • [Chi et al. CHI 2001]
    – Ed H. Chi, et al.: Using Information Scent to Model User Information Needs and Actions on the
    Web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2001.

  51. Dirichlet distribution: Simplex

  52. Utilization of Bayesian inference and marginal likelihoods
    Probability of data given hypothesis
    Model evidence
    Parameters are marginalized out