Comparing Sequential Hypotheses with HypTrails

Philipp Singer

April 20, 2016

Transcript

  1. Part 4
    Comparing Hypotheses about
    Sequential Data

  2. Example: Human Navigation

    Humans prefer to navigate…
    – H1: over semantically similar websites
    – H2: via self-loops (e.g., refreshing)
    – H3: by using the structural link network
    – H4: by preferring similar categories
    – H5: by utilizing structural properties
    – H6: by information scent
    [West et al. IJCAI 2009], [Singer et al. IJSWIS 2013],
    [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

  3. Example: Human Navigation

    What is the relative plausibility of these
    hypotheses given data?

  4. Example: Human Navigation

    HypTrails
    [Singer et al. WWW 2015]

  5. HypTrails in a nutshell

    Goal: Express and compare hypotheses about sequences
    in a coherent research approach

    Method:
    – First-order Markov chain model
    – Bayesian inference

    Idea:
    – Incorporate hypotheses as priors
    – Exploit the sensitivity of the marginal likelihood to the prior

    Outcome: Partial ordering of hypotheses

  6. Structure of HypTrails

  7. Structure of HypTrails

    [Figure: Markov chain (MC) model with states S1, S2, S3;
    transition probabilities shown: 1/2, 1/2, 1/3, 2/3, 1]
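
To make the first-order Markov chain model concrete, here is a minimal Python sketch that estimates transition probabilities from observed trails by counting consecutive state pairs. The function names and the toy trails are hypothetical illustrations, not taken from the slides.

```python
from collections import defaultdict


def transition_counts(trails):
    """Count transitions between consecutive states over all trails."""
    counts = defaultdict(lambda: defaultdict(int))
    for trail in trails:
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1
    return counts


def transition_probabilities(counts):
    """Row-normalize counts into maximum-likelihood transition probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical toy trails over the three states S1, S2, S3.
trails = [["S1", "S2", "S3", "S1"], ["S1", "S3", "S1", "S2"]]
counts = transition_counts(trails)
probs = transition_probabilities(counts)
for src in sorted(probs):
    print(src, probs[src])
```

A first-order model only looks at the current state to predict the next one, which is why counting adjacent pairs is sufficient here.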

  8. How to express hypotheses?

  9. How to express hypotheses?
    As assumptions about the parameters of the
    Markov chain model.

  10. Structural hypothesis

    [Figure: state diagram with transition probabilities 1/3, 1, 1/3, 1, 1/3]

  11. Uniform hypothesis

    [Figure: state diagram with transition probability 1/3]
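
As a sketch of how such hypotheses can be written down as belief matrices over state transitions, the following Python builds a uniform and a structural hypothesis and row-normalizes them. The link structure and all function names are hypothetical; the slides only show the resulting diagrams.

```python
def uniform_hypothesis(states):
    """Every transition is believed to be equally likely."""
    return {s: {t: 1.0 for t in states} for s in states}


def structural_hypothesis(states, links):
    """Only transitions along existing links are believed to occur."""
    return {
        s: {t: (1.0 if t in links.get(s, ()) else 0.0) for t in states}
        for s in states
    }


def row_normalize(matrix):
    """Scale each row to sum to 1; rows without any belief stay all zero."""
    normalized = {}
    for s, row in matrix.items():
        total = sum(row.values())
        normalized[s] = {t: (v / total if total > 0 else 0.0) for t, v in row.items()}
    return normalized


states = ["S1", "S2", "S3"]
# Hypothetical link network, chosen only for illustration.
links = {"S1": ["S2", "S3"], "S2": ["S3"], "S3": ["S1"]}

uniform = row_normalize(uniform_hypothesis(states))
structural = row_normalize(structural_hypothesis(states, links))
print(uniform["S1"])
print(structural["S1"])
```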

  12. Structure of HypTrails

    [Diagram: MC Model → Hypothesis (H1), expressed as belief in parameters.
    Two 3×3 heatmaps over states 1 to 3, cell values listed row by row as shown:
      0.00 0.33 0.00
      1.00 0.33 1.00
      0.00 0.33 0.00
    and
      0.00 0.99 0.00
      3.01 0.99 3.01
      0.00 0.99 0.00]

  13. Empirical observations

    [Figure: state diagram with observed transition probabilities 1.0, 2/3, 1/3, 1]

  14. Which hypothesis is the most
    plausible one?

  15. Bayesian model comparison:
    marginal likelihood

  16. Bayesian model comparison:
    marginal likelihood
    Probability of parameters
    before observing data

  17. Bayesian model comparison:
    marginal likelihood
    Probability of parameters
    before observing data
    Hypothesis
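
The formula on these slides was an image and did not survive in the transcript. A hedged reconstruction in standard Bayesian notation follows; the symbols D (data), θ (Markov chain parameters) and H (hypothesis) are assumptions and may differ from the notation on the original slides.

```latex
% Marginal likelihood (evidence) of the data under hypothesis H
P(D \mid H) = \int P(D \mid \theta)\, P(\theta \mid H)\, d\theta

% Bayes factor comparing two hypotheses H_1 and H_2
B_{1,2} = \frac{P(D \mid H_1)}{P(D \mid H_2)}
```

Here P(θ | H) is the prior, that is, the probability of the parameters before observing data, and it is the place where a hypothesis enters the model. A Bayes factor above 1 favours H_1 over H_2.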

  18. Structure of HypTrails

    [Diagram: MC Model → Hypothesis (H1), expressed as belief in parameters;
    Hypothesis (H1) → Prior (H1), via elicitation;
    Prior (H1) and Data (Trails) both influence the Marginal likelihood (H1)]

  19. How to elicit priors from
    expressed hypotheses?

  20. Conjugate Dirichlet prior

    Hyperparameters: pseudo counts

  21. Conjugate Dirichlet prior

    Hyperparameters: pseudo counts
    Hypothesis parameters → Dirichlet hyperparameters
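
For reference, the standard Dirichlet results behind the "pseudo counts" reading can be written out as below. The notation is an assumption: θ_i is the i-th row of the transition matrix, α_ij are the Dirichlet hyperparameters, and n_ij is the number of observed transitions from state i to state j.

```latex
% Dirichlet prior on each row of the transition matrix
P(\theta_i \mid \alpha_i) =
  \frac{\Gamma\left(\sum_j \alpha_{ij}\right)}{\prod_j \Gamma(\alpha_{ij})}
  \prod_j \theta_{ij}^{\alpha_{ij} - 1}

% Conjugacy: the posterior is again Dirichlet, with counts added to pseudo counts
\theta_i \mid D \sim \mathrm{Dir}\left(n_{i1} + \alpha_{i1}, \dots, n_{im} + \alpha_{im}\right)

% Closed-form marginal likelihood of a first-order Markov chain
P(D \mid H) = \prod_i
  \frac{\Gamma\left(\sum_j \alpha_{ij}\right)}{\prod_j \Gamma(\alpha_{ij})}
  \cdot
  \frac{\prod_j \Gamma(n_{ij} + \alpha_{ij})}{\Gamma\left(\sum_j (n_{ij} + \alpha_{ij})\right)}
```

Because the hyperparameters are simply added to the observed counts in the posterior, they behave like transitions seen before any data was collected.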

  22. Elicitation

    Multiply the row-normalized hypothesis matrix by the
    concentration parameter k

    Higher k → stronger belief

    Additional proto-prior
    Hypothesis parameters → Dirichlet hyperparameters
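
The elicitation step described above can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' implementation: the function name is hypothetical, and a proto-prior of 1 pseudo count per transition (a flat Dirichlet) is an assumption, since the slide does not state its value.

```python
def elicit_dirichlet_prior(hypothesis, k, proto_prior=1.0):
    """Turn a hypothesis matrix into Dirichlet pseudo counts.

    Each row is normalized, scaled by the concentration parameter k,
    and shifted by the proto-prior (assumed here to be 1 per cell).
    """
    prior = {}
    for state, row in hypothesis.items():
        total = sum(row.values())
        prior[state] = {
            target: (k * value / total if total > 0 else 0.0) + proto_prior
            for target, value in row.items()
        }
    return prior


# Hypothetical structural hypothesis over three states.
structural = {
    "S1": {"S1": 0.0, "S2": 1.0, "S3": 1.0},
    "S2": {"S1": 0.0, "S2": 0.0, "S3": 1.0},
    "S3": {"S1": 1.0, "S2": 0.0, "S3": 0.0},
}

weak = elicit_dirichlet_prior(structural, k=0)    # proto-prior only
strong = elicit_dirichlet_prior(structural, k=4)  # stronger belief in the hypothesis
print(weak["S1"])
print(strong["S1"])
```

With k = 0 every hypothesis collapses to the same proto-prior, so differences between hypotheses only appear as k grows.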

  23. 2-state example: Beta prior
    Hypothesis:

  24. 2-state example: Beta prior
    Hypothesis:
    k = 0

  25. 2-state example: Beta prior
    Hypothesis:
    k = 1

  26. 2-state example: Beta prior
    Hypothesis:
    k = 10

  27. 2-state example: Beta prior
    Hypothesis:
    k = 100
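
The effect of k in the two-state case can be illustrated numerically. The hypothesis value used on the slides is not preserved in the transcript, so the sketch below assumes a hypothetical belief of 0.8 for one of the two transitions and a proto-prior of 1; both are illustrative choices only.

```python
def beta_prior(p, k, proto_prior=1.0):
    """Beta parameters for a two-state hypothesis (p, 1 - p) at concentration k."""
    return k * p + proto_prior, k * (1.0 - p) + proto_prior


def beta_mean_and_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance


hypothesis_p = 0.8  # assumed value, not taken from the slides
summary = {}
for k in (0, 1, 10, 100):
    a, b = beta_prior(hypothesis_p, k)
    mean, variance = beta_mean_and_variance(a, b)
    summary[k] = (a, b, mean, variance)
    print(f"k={k}: Beta({a:.1f}, {b:.1f}), mean={mean:.3f}, variance={variance:.5f}")
```

As k increases, the prior mean moves from 0.5 towards the hypothesized value and the variance shrinks, which is the sense in which a higher k expresses a stronger belief.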

  28. Example: Structural hypothesis
    proto-prior

  29. Structure of HypTrails

    [Diagram: MC Model → Hypothesis (H1) and Hypothesis (H2), expressed as belief in parameters;
    each hypothesis → Dirichlet Prior (H1), Dirichlet Prior (H2), via elicitation;
    each prior and the Data (Trails) influence Marginal likelihood (H1), Marginal likelihood (H2);
    the marginal likelihoods are compared]
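
The whole pipeline in this diagram can be sketched end to end: elicit a Dirichlet prior per hypothesis, compute the log marginal likelihood of the data, and compare. The code below is a minimal illustration using the standard closed-form Dirichlet-multinomial evidence; the function names, the toy transition counts, the choice of k values and the proto-prior of 1 are all assumptions.

```python
from math import lgamma


def elicit(hypothesis, k, proto_prior=1.0):
    """Pseudo counts: k times the row-normalized hypothesis plus the proto-prior."""
    alpha = []
    for row in hypothesis:
        total = sum(row)
        alpha.append([(k * v / total if total > 0 else 0.0) + proto_prior for v in row])
    return alpha


def log_evidence(counts, alpha):
    """Log marginal likelihood of a first-order Markov chain.

    counts[i][j] are observed transitions from i to j, alpha[i][j] the
    Dirichlet pseudo counts; each row has its own Dirichlet prior.
    """
    log_ev = 0.0
    for n_row, a_row in zip(counts, alpha):
        log_ev += lgamma(sum(a_row)) - sum(lgamma(a) for a in a_row)
        log_ev += sum(lgamma(n + a) for n, a in zip(n_row, a_row))
        log_ev -= lgamma(sum(n_row) + sum(a_row))
    return log_ev


# Hypothetical toy data in which self-transitions dominate.
counts = [[8, 1, 1], [1, 8, 1], [1, 1, 8]]
hypotheses = {
    "uniform": [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    "self-loop": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}

results = {
    k: {name: log_evidence(counts, elicit(h, k)) for name, h in hypotheses.items()}
    for k in (0, 1, 5)
}
for k, row in results.items():
    log_bayes_factor = row["self-loop"] - row["uniform"]
    print(k, row, log_bayes_factor)
```

Working in log space with `lgamma` avoids overflow for realistic count sizes. A positive log Bayes factor at a given k means the data are more plausible under the self-loop hypothesis than under the uniform one in this toy setting.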

  30. Example result: Last.fm

    [Plot: evidence (scale 1e5, from −1.55 to −1.10) against hypothesis weighting factor k (0 to 4)
    for the hypotheses uniform, self-loop, track date, and similarity.
    Annotations: higher evidence means higher plausibility; larger k means higher belief.]

  31. Example result: Last.fm

    [Plot: evidence (scale 1e5, from −1.55 to −1.10) against hypothesis weighting factor k (0 to 4)
    for the hypotheses uniform, self-loop, track date, and similarity.]

  32. Hands-on Jupyter notebook

  33. Further applications

    Ontology engineering – edit sequences
    [Walk et al. ISWC 2015]

    Real-world navigational trails
    – Flickr [Becker et al. SocialCom 2015]
    – Taxi data [Espín-Noboa et al. WWW 2016]

    Car data [Atzmüller et al. WWW 2016]

    Wikipedia co-editing patterns
    [Samoilenko et al. 2016]

  34. Methodological extensions

    Detect and model heterogeneity in data

    Higher-order Markov chain models

    Adaptation to other models

  35. What have we learned?

    Comparing hypotheses about sequential data

    Bayesian approach: HypTrails

    Applications

  36. Thanks for your attention!
    @ph_singer
    www.philippsinger.info florian.lemmerich.net

  37. References 1/2
    [West et al. WWW 2015] Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from
    Human Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference
    (WWW'15), Florence, Italy, 2015.
    [Singer et al. IJSWIS 2013] Philipp Singer, Thomas Niebler, Markus Strohmaier, and Andreas Hotho: Computing
    Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia. International Journal on
    Semantic Web and Information Systems (IJSWIS), vol. 9(4), pp. 41–70, 2013.
    [West & Leskovec WWW 2012] Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st
    International World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.
    [Chi et al. CHI 2001] Chi, Ed H., et al.: Using Information Scent to Model User Information Needs and Actions on
    the Web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2001.
    [Singer et al. WWW 2015] Philipp Singer, Denis Helic, Andreas Hotho, and Markus Strohmaier: HypTrails: A Bayesian
    Approach for Comparing Hypotheses About Human Trails on the Web. Proceedings of the 24th International
    Conference on World Wide Web, pp. 1003–1013. International World Wide Web Conferences Steering
    Committee, 2015.
    [Walk et al. ISWC 2015] Simon Walk, Philipp Singer, Lisette Espín Noboa, Tania Tudorache, Mark A. Musen, and
    Markus Strohmaier: Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World
    Projects. 14th International Semantic Web Conference, Bethlehem, Pennsylvania, USA, 2015.

  38. References 2/2
    [Becker et al. SocialCom 2015] Martin Becker, Philipp Singer, Florian Lemmerich,
    Andreas Hotho, Denis Helic, and Markus Strohmaier: Photowalking the City: Comparing
    Hypotheses About Urban Photo Trails on Flickr. 7th International Conference on Social
    Informatics, Beijing, China, 2015.
    [Espín-Noboa et al. WWW 2016] Lisette Espín-Noboa, Florian Lemmerich, Philipp
    Singer, and Markus Strohmaier: Discovering and Characterizing Mobility Patterns in
    Urban Spaces: A Study of Manhattan Taxi Data. 6th International Workshop on Location
    and the Web at WWW2016, Montreal, Canada, 2016.
    [Samoilenko et al. 2016] Samoilenko, A., Karimi, F., Edler, D., Kunegis, J., and Strohmaier,
    M.: Linguistic neighbourhoods: explaining cultural borders on Wikipedia through
    multilingual co-editing activity. EPJ Data Science, 5(1), 1, 2016.
    [Atzmüller et al. WWW 2016] Atzmueller, M., Schmidt, A., and Kibanov, M.:
    DASHTrails: An Approach for Modeling and Analysis of Distribution-Adapted Sequential
    Hypotheses and Trails. Proceedings of the World Wide Web Conference Companion,
    2016.
