Slide 1

Slide 1 text

Part 4: Comparing Hypotheses about Sequential Data

Slide 2

Slide 2 text

Example: Human Navigation
● Humans prefer to navigate…
  – H1: over semantically similar websites
  – H2: via self-loops (e.g., refreshing)
  – H3: by using the structural link network
  – H4: by preferring similar categories
  – H5: by utilizing structural properties
  – H6: by information scent
[West et al. IJCAI 2009], [Singer et al. IJSWIS 2013], [West & Leskovec WWW 2012], [Chi et al. CHI 2001]

Slide 3

Slide 3 text

Example: Human Navigation
● Humans prefer to navigate… (hypotheses H1–H6 as listed on Slide 2)
What is the relative plausibility of these hypotheses given data?

Slide 4

Slide 4 text

Example: Human Navigation
● Humans prefer to navigate… (hypotheses H1–H6 as listed on Slide 2)
HypTrails [Singer et al. WWW 2015]

Slide 5

Slide 5 text

HypTrails in a nutshell
● Goal: Express and compare hypotheses about sequences in a coherent research approach
● Method:
  – First-order Markov chain model
  – Bayesian inference
● Idea:
  – Incorporate hypotheses as priors
  – Utilize the sensitivity of the marginal likelihood to the prior
● Outcome: Partial ordering of hypotheses

Slide 6

Slide 6 text

Structure of HypTrails

Slide 7

Slide 7 text

Structure of HypTrails
[Diagram. Box: MC Model. States S1, S2, S3 with transition probabilities 1/2, 1/2, 1/3, 2/3, 1]

Slide 8

Slide 8 text

How to express hypotheses?

Slide 9

Slide 9 text

How to express hypotheses? As assumptions about the parameters of the Markov chain model.

Slide 10

Slide 10 text

Structural hypothesis
[Figure: state diagram with transition probabilities 1/3, 1, 1/3, 1, 1/3]

Slide 11

Slide 11 text

Uniform hypothesis
[Figure: state diagram with transition probability 1/3]
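As a rough illustration of expressing hypotheses as Markov chain parameters, the sketch below builds a structural and a uniform hypothesis matrix for three states. The link network in `links` and all function names are assumptions made for this example; they are not taken from the slides.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def structural_hypothesis(links, n_states):
    """Belief that transitions follow existing links, uniformly per source state."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for source, target in links:
        matrix[source][target] = 1.0
    return row_normalize(matrix)


def uniform_hypothesis(n_states):
    """Belief that every transition is equally likely."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]


# Hypothetical link network over states 0, 1, 2 (assumed for illustration).
links = [(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]

structural = structural_hypothesis(links, 3)
uniform = uniform_hypothesis(3)
```

Each row of a hypothesis matrix is a belief about where a walker goes next from that state, so rows (not columns) sum to 1 in this sketch.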

Slide 12

Slide 12 text

Structure of HypTrails
[Diagram. Boxes: MC Model, Hypothesis (H1). Label: Belief in parameters. State labels: 1, 2, 3]
Belief in parameters (first heatmap, axes labelled h1 to h3):
        h3    h2    h1
  h3   0.00  0.33  0.00
  h2   1.00  0.33  1.00
  h1   0.00  0.33  0.00
Belief in parameters (second heatmap, same axes):
        h3    h2    h1
  h3   0.00  0.99  0.00
  h2   3.01  0.99  3.01
  h1   0.00  0.99  0.00

Slide 13

Slide 13 text

Empirical observations
[Figure: state diagram with observed transition probabilities 1.0, 2/3, 1/3, 1]
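Empirical transition probabilities of a first-order Markov chain can be estimated by counting consecutive state pairs in the observed trails. The following is a minimal sketch; the trails and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_transitions(trails):
    """Count how often each (source, target) pair occurs consecutively."""
    counts = Counter()
    for trail in trails:
        for source, target in zip(trail, trail[1:]):
            counts[(source, target)] += 1
    return counts


def empirical_transition_probabilities(trails):
    """Maximum-likelihood estimate: pair count divided by outgoing count of the source."""
    counts = count_transitions(trails)
    outgoing = Counter()
    for (source, _target), n in counts.items():
        outgoing[source] += n
    return {pair: n / outgoing[pair[0]] for pair, n in counts.items()}


# Hypothetical trails over states S1, S2, S3 (assumed for illustration).
trails = [["S1", "S2", "S1", "S2"], ["S2", "S3", "S2", "S1"]]
probabilities = empirical_transition_probabilities(trails)
```

Transitions that never occur are simply absent from the result, which corresponds to an estimated probability of zero.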

Slide 14

Slide 14 text

Which hypothesis is the most plausible one?

Slide 15

Slide 15 text

Bayesian model comparison: marginal likelihood

Slide 16

Slide 16 text

Bayesian model comparison: marginal likelihood
[Annotation: probability of parameters before observing data]

Slide 17

Slide 17 text

Bayesian model comparison: marginal likelihood
[Annotations: probability of parameters before observing data; hypothesis]
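For reference, the marginal likelihood (evidence) in its standard textbook form integrates the likelihood over the prior on the parameters. The symbols $D$ (data), $H$ (hypothesis), and $\theta$ (model parameters) are notation chosen here and may differ from the notation used on the slide.

```latex
P(D \mid H) = \int P(D \mid \theta, H)\, P(\theta \mid H)\, d\theta
```

Here $P(\theta \mid H)$ is the prior, i.e. the probability of the parameters before observing data, and it is the term through which a hypothesis enters the comparison.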

Slide 18

Slide 18 text

Structure of HypTrails
[Diagram. Boxes: MC Model, Hypothesis (H1), Prior (H1), Data (Trails), Marginal likelihood (H1). Labels: Belief in parameters, Elicitation, Influence (twice)]

Slide 19

Slide 19 text

How to elicit priors from expressed hypotheses?

Slide 20

Slide 20 text

Conjugate Dirichlet prior
● Hyperparameters: pseudo counts

Slide 21

Slide 21 text

Conjugate Dirichlet prior
● Hyperparameters: pseudo counts
[Figure labels: Hypothesis parameters, Dirichlet hyperparameters]
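As a reminder of why conjugacy is convenient, the standard Dirichlet-multinomial results for a single row $i$ of the transition matrix are written out below. The notation is an assumption made here: $\alpha_{ij}$ are the pseudo counts, $n_{ij}$ the observed transition counts from state $i$ to state $j$, and $\theta_i$ the transition probabilities out of state $i$.

```latex
\begin{aligned}
\theta_i &\sim \mathrm{Dir}(\alpha_{i1}, \dots, \alpha_{in}) \\
\theta_i \mid D &\sim \mathrm{Dir}(n_{i1} + \alpha_{i1}, \dots, n_{in} + \alpha_{in}) \\
P(D \mid H) &= \prod_{i} \frac{\Gamma\!\left(\sum_j \alpha_{ij}\right)}{\prod_j \Gamma(\alpha_{ij})}
               \cdot \frac{\prod_j \Gamma(n_{ij} + \alpha_{ij})}{\Gamma\!\left(\sum_j (n_{ij} + \alpha_{ij})\right)}
\end{aligned}
```

The posterior simply adds observed counts to pseudo counts, and the marginal likelihood has a closed form, so no numerical integration is needed.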

Slide 22

Slide 22 text

Elicitation
● Multiply the row-normalized hypothesis matrix by the concentration parameter k
● Higher k → stronger belief
● Additional proto-prior
[Figure labels: Hypothesis parameters, Dirichlet hyperparameters]
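The elicitation steps listed above can be sketched as follows. This is a simplified reading of the slide, not necessarily the published procedure: the function name is hypothetical, and a proto-prior of one pseudo count per transition (a flat Dirichlet) is an assumption.

```python
def elicit_dirichlet_prior(hypothesis, k, proto_prior=1.0):
    """Turn a hypothesis matrix into Dirichlet pseudo counts.

    Each row is normalized to sum to 1, scaled by the concentration
    parameter k, and shifted by the proto-prior (assumed to be 1.0).
    """
    alphas = []
    for row in hypothesis:
        total = sum(row)
        alphas.append(
            [(k * v / total if total else 0.0) + proto_prior for v in row]
        )
    return alphas


# Hypothetical, unnormalized beliefs over three states.
hypothesis = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]

weak_prior = elicit_dirichlet_prior(hypothesis, k=0)    # proto-prior only
strong_prior = elicit_dirichlet_prior(hypothesis, k=3)  # stronger belief
```

With k = 0 only the proto-prior remains, so every hypothesis collapses to the same flat prior; larger k concentrates more pseudo counts on the transitions the hypothesis believes in.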

Slide 23

Slide 23 text

2-state example: Beta prior
[Figure: hypothesis for the 2-state example]

Slide 24

Slide 24 text

2-state example: Beta prior
[Figure: Beta prior for the hypothesis, k = 0]

Slide 25

Slide 25 text

2-state example: Beta prior
[Figure: Beta prior for the hypothesis, k = 1]

Slide 26

Slide 26 text

2-state example: Beta prior
[Figure: Beta prior for the hypothesis, k = 10]

Slide 27

Slide 27 text

2-state example: Beta prior
[Figure: Beta prior for the hypothesis, k = 100]
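With two states, each row's Dirichlet prior reduces to a Beta distribution, and the effect of k can be seen from its mean and variance. The sketch below assumes a hypothetical hypothesis value p = 0.8 (the value used on the slide is not known) and a proto-prior of 1.

```python
def beta_prior(p, k, proto_prior=1.0):
    """Beta parameters for a 2-state row with hypothesized probability p."""
    a = k * p + proto_prior
    b = k * (1.0 - p) + proto_prior
    return a, b


def beta_mean_and_variance(a, b):
    """Closed-form mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


p = 0.8  # hypothetical belief in the first transition (assumed)
for k in (0, 1, 10, 100):
    a, b = beta_prior(p, k)
    mean, variance = beta_mean_and_variance(a, b)
    print(f"k={k:>3}: Beta({a:.1f}, {b:.1f}) mean={mean:.3f} variance={variance:.5f}")
```

At k = 0 the prior is the flat Beta(1, 1); as k grows, the mean moves toward p and the variance shrinks, which is what "higher k, stronger belief" means in practice.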

Slide 28

Slide 28 text

Example: Structural hypothesis, proto-prior

Slide 29

Slide 29 text

Structure of HypTrails
[Diagram. Boxes: MC Model, Hypothesis (H1), Dirichlet Prior (H1), Data (Trails), Marginal likelihood (H1), Hypothesis (H2), Dirichlet Prior (H2), Marginal likelihood (H2). Labels: Compare, Belief in parameters, Elicitation, Influence (twice)]

Slide 30

Slide 30 text

Example result: Last.fm
[Plot: evidence (y-axis, scale 1e5, ticks from −1.55 to −1.10) against hypothesis weighting factor k (x-axis, 0 to 4). Legend text: uniform self-loop track date similarity. Annotations: higher plausibility (along the evidence axis), higher belief (along the k axis)]

Slide 31

Slide 31 text

Example result: Last.fm
[Plot: evidence (y-axis, scale 1e5, ticks from −1.55 to −1.10) against hypothesis weighting factor k (x-axis, 0 to 4). Legend text: uniform self-loop track date similarity]
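A comparison of this kind can be sketched by computing the log marginal likelihood of the observed transition counts under each hypothesis-derived Dirichlet prior for several values of k. Everything concrete below is assumed for illustration: the counts, the three hypotheses, the proto-prior of 1, and the function name. It does not reproduce the Last.fm result.

```python
from math import lgamma


def log_evidence(counts, alphas):
    """Log marginal likelihood of transition counts under row-wise Dirichlet priors."""
    total = 0.0
    for n_row, a_row in zip(counts, alphas):
        # Log of 1 / B(alpha): inverse normalizer of the prior for this row.
        total += lgamma(sum(a_row)) - sum(lgamma(a) for a in a_row)
        # Log of B(n + alpha): normalizer of the posterior for this row.
        total += sum(lgamma(n + a) for n, a in zip(n_row, a_row))
        total -= lgamma(sum(n_row) + sum(a_row))
    return total


# Hypothetical transition counts between three states (rows: source state).
counts = [[0, 20, 0], [12, 2, 6], [0, 10, 0]]

# Hypothetical row-normalized hypotheses.
hypotheses = {
    "uniform": [[1 / 3] * 3 for _ in range(3)],
    "self-loop": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "structural": [[0, 1, 0], [1 / 3, 1 / 3, 1 / 3], [0, 1, 0]],
}

PROTO_PRIOR = 1.0  # assumed: one pseudo count per transition

results = {}
for k in range(5):
    results[k] = {}
    for name, hypothesis in hypotheses.items():
        alphas = [[k * v + PROTO_PRIOR for v in row] for row in hypothesis]
        results[k][name] = log_evidence(counts, alphas)
    # Order hypotheses from most to least plausible for this k.
    ranking = sorted(results[k], key=results[k].get, reverse=True)
    print(k, ranking)
```

At k = 0 all hypotheses share the same flat prior and therefore the same evidence; the curves separate only as k grows, which is why results are reported across a range of k rather than for a single value.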

Slide 32

Slide 32 text

Hands-on Jupyter notebook

Slide 33

Slide 33 text

Further applications
● Ontology engineering
  – edit sequences [Walk et al. ISWC 2015]
● Real-world navigational trails
  – Flickr [Becker et al. SocialCom 2015]
  – Taxi data [Espín-Noboa et al. WWW 2016]
  – Car data [Atzmüller et al. WWW 2016]
● Wikipedia co-editing patterns [Samoilenko et al. 2016]

Slide 34

Slide 34 text

Methodological extensions
● Detect and model heterogeneity in data
● Higher-order Markov chain models
● Adaptation to other models

Slide 35

Slide 35 text

What have we learned?
● Comparing hypotheses about sequential data
● Bayesian approach: HypTrails
● Applications

Slide 36

Slide 36 text

Questions?

Slide 37

Slide 37 text

Thanks for your attention!
@ph_singer
www.philippsinger.info
florian.lemmerich.net

Slide 38

Slide 38 text

References 1/2
[West et al. WWW 2015] Robert West, Ashwin Paranjape, and Jure Leskovec: Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia. 24th International World Wide Web Conference (WWW'15), Florence, Italy, 2015.
[Singer et al. IJSWIS 2013] Philipp Singer, Thomas Niebler, Markus Strohmaier, and Andreas Hotho: Computing Semantic Relatedness from Human Navigational Paths: A Case Study on Wikipedia. International Journal on Semantic Web and Information Systems (IJSWIS), vol. 9(4), pp. 41–70, 2013.
[West & Leskovec WWW 2012] Robert West and Jure Leskovec: Human Wayfinding in Information Networks. 21st International World Wide Web Conference (WWW'12), pp. 619–628, Lyon, France, 2012.
[Chi et al. CHI 2001] Chi, Ed H., et al.: Using information scent to model user information needs and actions and the Web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2001.
[Singer et al. WWW 2015] Philipp Singer, Denis Helic, Andreas Hotho, and Markus Strohmaier: HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web. Proceedings of the 24th International Conference on World Wide Web, pp. 1003–1013, International World Wide Web Conferences Steering Committee, 2015.
[Walk et al. ISWC 2015] Simon Walk, Philipp Singer, Lisette Espín Noboa, Tania Tudorache, Mark A. Musen, and Markus Strohmaier: Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World Projects. 14th International Semantic Web Conference, Bethlehem, Pennsylvania, USA, 2015.

Slide 39

Slide 39 text

References 2/2
[Becker et al. SocialCom 2015] Martin Becker, Philipp Singer, Florian Lemmerich, Andreas Hotho, Denis Helic, and Markus Strohmaier: Photowalking the City: Comparing Hypotheses About Urban Photo Trails on Flickr. 7th International Conference on Social Informatics, Beijing, China, 2015.
[Espín-Noboa et al. WWW 2016] Lisette Espín-Noboa, Florian Lemmerich, Philipp Singer, and Markus Strohmaier: Discovering and Characterizing Mobility Patterns in Urban Spaces: A Study of Manhattan Taxi Data. 6th International Workshop on Location and the Web at WWW2016, Montreal, Canada, 2016.
[Samoilenko et al. 2016] Samoilenko, A., Karimi, F., Edler, D., Kunegis, J., and Strohmaier, M.: Linguistic neighbourhoods: explaining cultural borders on Wikipedia through multilingual co-editing activity. EPJ Data Science, 5(1), 1, 2016.
[Atzmüller et al. WWW 2016] Atzmueller, M., Schmidt, A., and Kibanov, M.: DASHTrails: An Approach for Modeling and Analysis of Distribution-Adapted Sequential Hypotheses and Trails. Proceedings of the World Wide Web Conference Companion, 2016.