Slide 1

Slide 1 text

You are where you eat: Foursquare checkins as indicators of human mobility and behaviour
International Workshop on the Impact of Human Mobility in Pervasive Systems and Applications (PerMoby)
19 March 2012
Gualtiero Colombo, Martin Chorley, Matthew Williams, Stuart Allen, Roger Whitaker
Cardiff University School of Computer Science & Informatics

Slide 2

Slide 2 text

Overview
• Background
• Motivation
• Foursquare
• Data collection
• Observations

Slide 3

Slide 3 text

Motivation
• Areas of study:
  • Presence of routine (regularity) in mobility and encounters
  • Relationship between personality traits and mobility behaviour
  • Heterogeneity in individuals’ behaviours
• Applications:
  • Content provisioning
  • User profiling
  • Recommender systems

Slide 4

Slide 4 text

Why Foursquare?
• Appropriate datasets hard to find!
• In addition to the mobility trace, we want:
  • social graph
  • profiles of individuals
  • properties of the places individuals visit
• ...and comprehensive coverage of a geographic region!

Slide 5

Slide 5 text

About Foursquare
• “Location-based online social network”
• Users ‘check-in’ to their current venue to indicate they’ve visited it
• Venues are user-contributed
• Points, “mayorships”, and discounts to incentivise participation

Slide 6

Slide 6 text

About Foursquare
[Diagram labels: visit history; social graph; user database; venue database + category hierarchy]

Slide 7

Slide 7 text

World Foursquare usage
http://blog.foursquare.com/2011/01/24/2010infographic/

Slide 8

Slide 8 text

Data collection: Cardiff
city area: 140 km²
city population: 341,000

Slide 9

Slide 9 text

Data collection: Cambridge
city area: 116 km²
city population: 130,000

Slide 10

Slide 10 text

Collected data

City     City pop.  Collection area  # users (≥ 1 visit)  # venues (≥ 1 visit)  Checkins (total)  Checkins per venue  Checkins per user
Cardiff  320,000    7.0 x 9.0 km     1,701                1,234                 13,299            10.78               7.82
Camb.    120,000    5.0 x 3.5 km     1,196                852                   6,464             7.59                5.40

Collection period: Mon 21st March – Fri 13th May (53 continuous days)

Slide 11

Slide 11 text

Checkins heatmap
[Figure: checkin heatmaps for Cardiff and Cambridge, each with a 2km scale bar; colour scale runs from fewest to most checkins]

Slide 12

Slide 12 text

Areas of analysis
• User activity and venue popularity
• Inter-checkin time and distance
• Co-located checkins
• Checkin patterns

Slide 13

Slide 13 text

User activity and venue popularity

Slide 14

Slide 14 text

User activity
[Figure: Number of Users against Number of Checkins, for Cardiff and Cambridge]
• Users with exactly one checkin:
  • Cambridge: 31%
  • Cardiff: 43%
• Top 1% of users responsible for 15% of all checkins

Slide 15

Slide 15 text

Venue popularity
[Figure: Number of Venues against Number of Checkins, for Cardiff and Cambridge]
• Small minority of very-popular venues
  • Most-popular tend to be transport hubs
• A large number of venues with very few checkins
  • Usually “Home” venues

Slide 16

Slide 16 text

Inter-checkin time and distance

Slide 17

Slide 17 text

Inter-checkin time and jump distance
[Diagram: A→B: 5min, 300m; B→C: 3hrs, 100m]
• Jump distance:
  • (or: “inter-checkin distance”)
  • distance between two consecutive checkins
• Inter-checkin time:
  • time between two consecutive checkins
  • staying time + travel time
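The two definitions above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the (timestamp, latitude, longitude) record layout, the function names, and the use of the haversine great-circle formula are all assumptions, since the slides do not say how distance was measured. The coordinates in the example are made up.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def jumps(checkins):
    """For one user's checkins, return a list of
    (inter-checkin time in seconds, jump distance in metres),
    one entry per pair of consecutive checkins.

    `checkins` is a list of (timestamp, lat, lon) tuples (hypothetical layout).
    """
    ordered = sorted(checkins, key=lambda c: c[0])  # order by time
    result = []
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(ordered, ordered[1:]):
        gap_s = (t2 - t1).total_seconds()
        dist_m = haversine_m(lat1, lon1, lat2, lon2)
        result.append((gap_s, dist_m))
    return result


# Made-up trace of three checkins A, B, C for a single user.
trace = [
    (datetime(2012, 1, 1, 8, 30), 51.4800, -3.1800),  # A
    (datetime(2012, 1, 1, 8, 35), 51.4810, -3.1800),  # B, five minutes later
    (datetime(2012, 1, 1, 11, 35), 51.4810, -3.1800),  # C, three hours later
]
for gap_s, dist_m in jumps(trace):
    print(f"{gap_s:.0f} s, {dist_m:.0f} m")
```

Note that the inter-checkin time computed this way includes both staying time and travel time, as the slide points out; a checkin trace alone cannot separate the two.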

Slide 18

Slide 18 text

Inter-checkin time (Cardiff & Camb., inter-checkin time ≤ 24hr)
[Figure: Time between Checkins. Cumulative distribution P(X ≤ x) of time between checkins (seconds), for Cardiff and Cambridge; markers at 1h, 3h, 5h, 12h and 24h]
• 50% of consecutive checkins within five hours
• 20% of consecutive checkins within one hour
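Percentages such as "50% within five hours" are read off an empirical cumulative distribution of the inter-checkin times. A minimal sketch of that computation follows; the function name `ecdf` and the sample gaps are hypothetical, and the 24-hour cap is applied as a simple filter, which is an assumption about how the restriction on the slide was implemented.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function F such that F(x) is the fraction of `values` <= x."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("ecdf needs at least one value")
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


# Made-up inter-checkin times in seconds.
gaps_s = [600, 3000, 7200, 20000, 100000]

# Keep only gaps of at most 24 hours, then build the distribution.
capped = [g for g in gaps_s if g <= 24 * 3600]
F = ecdf(capped)
print(F(3600))      # fraction of gaps within one hour
print(F(5 * 3600))  # fraction of gaps within five hours
```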

Slide 19

Slide 19 text

Jump distance (Cardiff)
[Figure: Distance between Checkins. Cumulative distribution P(X ≤ x) of distance between checkins (metres), one curve per max inter-checkin time: 3 Hours, 6 Hours, 12 Hours, 1 Day, 2 Days, 1 Week, 1 Month; markers at 0.5k and 1k]
Cardiff:
• 60% to 70% of consecutive checkins within 1km
• 48% to 55% of consecutive checkins within 0.5km
(...depending on max inter-checkin time)

Slide 20

Slide 20 text

Jump distance (Cardiff & Camb., inter-checkin time ≤ 3hr)
[Figure: Distance between Checkins. Cumulative distribution P(X ≤ x) of distance between checkins (metres), for Cardiff and Cambridge; marker at 1km]
• For consecutive checkins within three hours...
  • Cardiff: 75% of jumps less than 1km
  • Camb.: 66% of jumps less than 1km
• On average, Cambridge users travel farther between checkins

Slide 21

Slide 21 text

Co-located checkins

Slide 22

Slide 22 text

Co-visiting behaviour
• The co-visiting patterns of users:
  • Two users checking in at the same venue within one hour of each other are said to have “co-visited”
  • Other co-visit thresholds can be chosen
• How many users ‘meet’ in this way?
• Does friendship influence co-visiting behaviour?
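The co-visit definition above can be illustrated with a short Python sketch. The (user, venue, timestamp) record layout, the function name `co_visitors`, and the sample data are hypothetical; treating "within one hour" as a gap of at most the threshold is also an assumption.

```python
from collections import defaultdict


def co_visitors(checkins, threshold_s=3600):
    """Map each user to the set of other users they co-visited with.

    `checkins` is an iterable of (user, venue, timestamp_in_seconds).
    Two users co-visit when they check in at the same venue with
    timestamps at most `threshold_s` seconds apart.
    """
    by_venue = defaultdict(list)
    for user, venue, ts in checkins:
        by_venue[venue].append((ts, user))

    met = defaultdict(set)
    for visits in by_venue.values():
        visits.sort(key=lambda v: v[0])  # order checkins at this venue by time
        for i, (t1, u1) in enumerate(visits):
            for t2, u2 in visits[i + 1:]:
                if t2 - t1 > threshold_s:
                    break  # later checkins are even further away in time
                if u1 != u2:
                    met[u1].add(u2)
                    met[u2].add(u1)
    return dict(met)


# Made-up checkins: (user, venue, seconds since some reference time).
sample = [
    ("ann", "cafe", 0),
    ("bob", "cafe", 1800),
    ("cat", "cafe", 7200),
    ("ann", "station", 100),
    ("cat", "station", 4000),
]
one_hour = co_visitors(sample)
three_hours = co_visitors(sample, threshold_s=3 * 3600)
print({u: sorted(v) for u, v in one_hour.items()})
print({u: sorted(v) for u, v in three_hours.items()})
```

The number of distinct co-visitors per user is then `len(met[user])`, and raising the threshold can only add co-visitors, never remove them.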

Slide 23

Slide 23 text

Number of co-visitors (Cardiff)
[Figure: Number of Unique Users with Co-located Checkins (per User). Cumulative distribution P(X ≤ x) of the number of users, one curve per co-visit threshold: 1 Hour, 3 Hours, 6 Hours, 12 Hours, 1 Day, 2 Days]
• At a co-visit threshold of one hour:
  • 10% of users co-visited with more than 10 different people
  • 35% of users co-visited with at least one other person
  • The most people a user met is 35

Slide 24

Slide 24 text

Time between co-visits: friends vs. non-friends
In Cardiff...
• average time between co-visits in a one-hour threshold...
  • between friends: 8.3 mins
  • between any users: 18.2 mins
• average time between co-visits in a three-hour threshold...
  • between friends: 27 mins
  • between any users: 63 mins
[Figure: Time between Co-located Checkins. Cumulative distribution P(X ≤ x) of time between co-located checkins (seconds), Cardiff; series: All Users - 1 Hours, Friends - 1 Hours, All Users - 3 Hours, Friends - 3 Hours]

Slide 25

Slide 25 text

Checkin patterns

Slide 26

Slide 26 text

Sequence analysis
• Can we find repeated patterns of checkins?
• Look at n-gram frequencies...
[Diagram: ... → Central Train Station (8:30am) → Starbucks (8:45am) → School of ComSc (9:00am) → Uni Cafe (10:30am) → ...]
example string: A-B-C-D-A-B-C-C-A-B-C

3-grams:
sequence  count (#repeats)
ABC       3
BCD       1
CDA       1
BCC       1
CCA       1
CAB       1
DAB       1
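The 3-gram counts for the example string can be reproduced with a few lines of Python. This is only a sketch of the counting step; the function name `ngram_counts` is hypothetical, and treating a sequence as "recurring" when it appears at least twice is an assumption rather than a definition taken from the slides.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count every run of n consecutive venues in `sequence`."""
    return Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )


# The example string from the slide, one letter per venue.
visits = "A-B-C-D-A-B-C-C-A-B-C".split("-")
counts = ngram_counts(visits, 3)

for gram, count in counts.most_common():
    print("".join(gram), count)

# Assumption: a sequence is "recurring" if it appears at least twice.
recurring = {gram for gram, count in counts.items() if count >= 2}
print(recurring)
```

For the example string only ABC recurs; the other six 3-grams each appear once.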

Slide 27

Slide 27 text

Recurring sequences per user (Cardiff)
[Figure: Distribution of unique tuples. Cumulative distribution P(X ≤ x) of the number of recurring sequences per user (number of unique tuples); series: doubles, triples, quadruples]
2-grams:
• 84% of users with no recurring sequences
• 10% of users had between one and five recurring sequences
• 6% of users had more than five recurring sequences
3-grams:
• 90% of users with no recurring sequences
• 6% of users had between one and five recurring sequences
• 4% of users had more than five recurring sequences

Slide 28

Slide 28 text

Fuzzy sequences
• Allow different intermediate venues
[Diagram: ... → Central Train Station → Starbucks? / Costa Coffee? / Caffe Nero? → School of ComSc → Uni Cafe → ...]
• Allow up to n intermediate venues when matching a pattern
• Similar to regular expression matching...
  Pattern: C *{0,2} A
  ...has two matches in A-B-C-D-A-B-C-C-A-B-C
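The fuzzy pattern on the slide can be tried out with Python's `re` module. This is a sketch under stated assumptions: each venue is encoded as a single character, the slide's `*{0,2}` is written as the regular expression `.{0,2}` (any venue, zero to two times), and matches are counted the way `re.findall` does, left to right and non-overlapping. The function name `fuzzy_matches` is hypothetical.

```python
import re


def fuzzy_matches(sequence, first, last, max_gap):
    """Find `first` followed by `last` with at most `max_gap` venues between.

    `sequence` is a string with one character per venue.
    """
    pattern = re.compile(
        re.escape(first) + ".{0,%d}" % max_gap + re.escape(last)
    )
    return pattern.findall(sequence)


# The example string from the slide with the separators removed.
visits = "A-B-C-D-A-B-C-C-A-B-C".replace("-", "")
print(fuzzy_matches(visits, "C", "A", 2))  # ['CDA', 'CCA']
```

Both matches have one intermediate venue (D in the first, C in the second), which agrees with the two matches stated on the slide. A different counting rule, for example one that allows overlapping matches, could give a different total.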

Slide 29

Slide 29 text

Distinct fuzzy sequences per user
[Figure: Distribution of doubles. Distribution of the number of doubles per user on a logarithmic axis (10^0 to 10^4); series: doubles2u, doubles2, doubles3, doubles5, doubles10]

Slide 30

Slide 30 text

Summary
• Users tend to make frequent and regular checkins to a limited number of venues
• A small subset of users show repeated sequences of checkins
• The type of venue affects regularity:
  • Home and Work venues are very regular; Outdoors venues are less regular
• Movement of friends influences co-visit behaviour
• City-specific characteristics affect user behaviours
  • Temporal behaviour is universal, but jump distance affected by geography?

Slide 31

Slide 31 text

Ongoing and future research
• Individual checkin patterns: regularity, predictability, heterogeneity
• Influence of friendship on co-visiting behaviour -- causality or commonality?
• Relationship between personality traits and visiting behaviour

Slide 32

Slide 32 text

Thanks for listening! Questions?
Matt Williams
www.mattjw.net
M.J.Williams@cs.cardiff.ac.uk
@voxmjw
www.gplus.to/mattjw
Supported by...
{ G.Colombo, M.J.Chorley, M.J.Williams, Stuart.M.Allen, R.M.Whitaker } @cs.cardiff.ac.uk
www.recognition-project.eu