You Are Where You Eat: Foursquare Checkins as Indicators of Human Mobility and Behaviour

Matt J Williams
March 19, 2012

Research talk.
Venue: International Workshop on the Impact of Human Mobility in Pervasive Systems and Applications (PerMoby), Lugano, Switzerland.

Transcript

  1. You are where you eat:
    Foursquare checkins as indicators of human
    mobility and behaviour
    International Workshop on the Impact of
    Human Mobility in Pervasive Systems and
    Applications
    (PerMoby)
    19 March 2012
    Gualtiero Colombo, Martin Chorley, Matthew Williams,
    Stuart Allen, Roger Whitaker
    Cardiff University
    School of Computer Science
    & Informatics

  2. Overview
    • Background
    • Motivation
    • Foursquare
    • Data collection
    • Observations

  3. Motivation
    • Areas of study:
      • Presence of routine (regularity) in mobility and encounters
      • Relationship between personality traits and mobility behaviour
      • Heterogeneity in individuals’ behaviours
    • Applications:
      • Content provisioning
      • User profiling
      • Recommender systems

  4. Why Foursquare?
    • Appropriate datasets hard to find!
    • In addition to the mobility trace, we want:
      • social graph
      • profiles of individuals
      • properties of the places individuals visit
    • ...and comprehensive coverage of a geographic region!

  5. About Foursquare
    • “Location-based online social
    network”
    • Users ‘check-in’ to their current
    venue to indicate they’ve visited it
    • Venues are user-contributed
    • Points, “mayorships”, and discounts to
    incentivise participation

  6. About Foursquare
    [Diagram labels: visit history; social graph; user database; venue database + category hierarchy]

  7. World Foursquare usage
    http://blog.foursquare.com/2011/01/24/2010infographic/

  8. Data collection
    Cardiff
    city area: 140 km²
    city population: 341,000

  9. Data collection
    Cambridge
    city area: 116 km²
    city population: 130,000

  10. Collected data
    City     City pop.  Collection area  # users (≥ 1 visit)  # venues (≥ 1 visit)  Checkins (total)  Checkins per venue  Checkins per user
    Cardiff  320,000    7.0 x 9.0 km     1,701                1,234                 13,299            10.78               7.82
    Camb.    120,000    5.0 x 3.5 km     1,196                852                   6,464             7.59                5.40
    Collection period: Mon 21st March – Fri 13th May (53 continuous days)

  11. Checkins heatmap
    [Figure: checkin heatmaps for Cardiff and Cambridge, each with a 2km scale bar; colour scale runs from fewest to most checkins]

  12. Areas of analysis
    • User activity and venue popularity
    • Inter-checkin time and distance
    • Co-located checkins
    • Checkin patterns

  13. User activity and venue
    popularity

  14. User activity
    [Figure: number of users (y-axis, 1 to 1000) against number of checkins (x-axis, 1 to 100) for Cardiff and Cambridge]
    • Users with exactly one checkin:
      • Cambridge: 31%
      • Cardiff: 43%
    • Top 1% of users responsible for 15% of all checkins
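
The two statistics on this slide can be computed from a flat list of checkins. The following is a minimal sketch, not the authors' code: the function name `activity_summary`, its input format (one user ID per checkin), and the rounding used to pick the "top 1%" of users are all assumptions made for illustration.

```python
from collections import Counter


def activity_summary(checkin_users, top_fraction=0.01):
    """Return (share of users with exactly one checkin,
               share of all checkins made by the most active users).

    checkin_users: iterable with one user ID per checkin (hypothetical format).
    top_fraction:  fraction of users treated as "top" users (0.01 = top 1%).
    """
    counts = Counter(checkin_users)
    if not counts:
        raise ValueError("no checkins supplied")

    # Checkins per user, most active first.
    per_user = sorted(counts.values(), reverse=True)
    n_users = len(per_user)

    single_share = sum(1 for c in per_user if c == 1) / n_users

    # Assumption: always include at least one user in the "top" group.
    top_n = max(1, round(n_users * top_fraction))
    top_share = sum(per_user[:top_n]) / sum(per_user)
    return single_share, top_share


if __name__ == "__main__":
    demo = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
    print(activity_summary(demo))
```

With a real dataset the input would be the user column of the checkin log for one city.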

  15. Venue popularity
    [Figure: number of venues (y-axis, 1 to 100) against number of checkins (x-axis, 1 to 100) for Cardiff and Cambridge]
    • Small minority of very-popular venues
      • Most-popular tend to be transport hubs
    • A large number of venues with very few checkins
      • Usually “Home” venues

  16. Inter-checkin time and
    distance

  17. Inter-checkin time and jump distance
    [Diagram: A→B: 5min, 300m; B→C: 3hrs, 100m]
    • Jump distance (or: “inter-checkin distance”):
      • distance between two consecutive checkins
    • Inter-checkin time:
      • time between two consecutive checkins
      • staying time + travel time
  18. Inter-checkin time
    [Figure: "Time between Checkins" for Cardiff and Cambridge; x-axis: time between checkins (seconds, 0 to 80,000); y-axis: P(X ≤ x); restricted to inter-checkin time ≤ 24hr; markers at 1h, 3h, 5h, 12h, and 24h]
    Cardiff & Camb.:
    • 50% of consecutive checkins within five hours
    • 20% of consecutive checkins within one hour

  19. Jump distance
    [Figure: "Distance between Checkins" for Cardiff; x-axis: distance between checkins (metres, 0 to 7,000); y-axis: P(X ≤ x); one curve per max inter-checkin time: 3 Hours, 6 Hours, 12 Hours, 1 Day, 2 Days, 1 Week, 1 Month; markers at 0.5k and 1k]
    Cardiff:
    • 60% to 70% of consecutive checkins within 1km
    • 48% to 55% of consecutive checkins within 0.5km
    (...depending on max inter-checkin time)

  20. Jump distance
    [Figure: "Distance between Checkins" for Cardiff and Cambridge; x-axis: distance between checkins (metres, 0 to 14,000); y-axis: P(X ≤ x); restricted to inter-checkin time ≤ 3hr; marker at 1km]
    Cardiff & Camb.:
    • For consecutive checkins within three hours...
      • Cardiff: 75% of jumps less than 1km
      • Camb.: 66% of jumps less than 1km
    • On average, Cambridge users travel farther between checkins

  21. Co-located checkins

  22. Co-visiting behaviour
    • The co-visiting patterns of users:
      • Two users checking in at the same venue within one hour of each other are said to have “co-visited”
      • Other co-visit thresholds can be chosen
    • How many users ‘meet’ in this way?
    • Does friendship influence co-visiting behaviour?
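
The co-visit definition above translates directly into a pairwise check per venue. The sketch below is an illustration under stated assumptions: the function name `co_visitors`, the `(user, venue, time)` tuple format, and the choice to treat "within" the threshold as inclusive are not taken from the slides.

```python
from collections import defaultdict
from datetime import timedelta
from itertools import combinations


def co_visitors(checkins, threshold=timedelta(hours=1)):
    """Map each user to the set of other users they co-visited with.

    checkins:  iterable of (user, venue, time) tuples (hypothetical format).
    threshold: maximum time between two checkins at the same venue for the
               pair to count as a co-visit (assumed inclusive).
    """
    by_venue = defaultdict(list)
    for user, venue, time in checkins:
        by_venue[venue].append((user, time))

    partners = defaultdict(set)
    for visits in by_venue.values():
        # Compare every pair of checkins made at the same venue.
        for (u1, t1), (u2, t2) in combinations(visits, 2):
            if u1 != u2 and abs(t1 - t2) <= threshold:
                partners[u1].add(u2)
                partners[u2].add(u1)
    return dict(partners)
```

The number of co-visitors per user is then `len(partners[user])`, and the threshold argument allows other co-visit windows (three hours, one day, and so on) to be tried. The pairwise comparison is quadratic in the number of checkins per venue, which is acceptable for a sketch but would need a sliding window for very busy venues.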

  23. Number of co-visitors
    • At a co-visit threshold of one hour:
      • 10% of users co-visited with more than 10 different people
      • 35% of users co-visited with at least one other person
      • The most people a user met is 35
    [Figure: "Number of Unique Users with Co-located Checkins (per User)" for Cardiff; x-axis: number of users (0 to 350); y-axis: cumulative probability; one curve per co-visit threshold: 1 Hour, 3 Hours, 6 Hours, 12 Hours, 1 Day, 2 Days]

  24. Time between co-visits: friends vs. non-friends
    In Cardiff...
    average time between co-visits in a one-hour threshold...
      between friends: 8.3 mins
      between any users: 18.2 mins
    average time between co-visits in a three-hour threshold...
      between friends: 27 mins
      between any users: 63 mins
    [Figure: "Time between Co-located Checkins" for Cardiff; x-axis: time between co-located checkins (seconds, 0 to 10,000); y-axis: P(X ≤ x); curves: All Users - 1 Hours, Friends - 1 Hours, All Users - 3 Hours, Friends - 3 Hours]

  25. Checkin patterns

  26. Sequence analysis
    • Can we find repeated patterns of checkins?
    • Look at n-gram frequencies...
    Example sequence:
    ... → Central Train Station (8:30am) → Starbucks (8:45am) → School of ComSc (9:00am) → Uni Cafe (10:30am) → ...
    Example string: A-B-C-D-A-B-C-C-A-B-C
    3-grams:
    sequence  count (#repeats)
    ABC       3
    BCD       1
    CDA       1
    BCC       1
    CCA       1
    CAB       1
    DAB       1
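
The 3-gram table for the example string can be reproduced with a sliding window. This is a minimal sketch, not the authors' implementation; the function names are hypothetical, and treating a sequence as "recurring" when it appears at least twice is an assumption.

```python
from collections import Counter


def ngram_counts(venues, n):
    """Count every run of n consecutive venues in one user's checkin sequence."""
    return Counter(tuple(venues[i:i + n]) for i in range(len(venues) - n + 1))


def recurring(counts, min_repeats=2):
    """Keep only n-grams seen at least min_repeats times (assumed definition)."""
    return {gram: c for gram, c in counts.items() if c >= min_repeats}


if __name__ == "__main__":
    example = "A-B-C-D-A-B-C-C-A-B-C".split("-")
    counts = ngram_counts(example, 3)
    for gram, c in counts.most_common():
        print("".join(gram), c)
```

For the example string this yields seven distinct 3-grams, with ABC counted three times and the other six once each, matching the table above.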

  27. Recurring sequences per user
    [Figure: "Distribution of unique tuples" for Cardiff; x-axis: number of unique tuples (0 to 70); y-axis: cumulative probability; curves: doubles, triples, quadruples]
    2-grams:
    • 84% of users with no recurring sequences
    • 10% of users had between one and five recurring sequences
    • 6% of users had more than five recurring sequences
    3-grams:
    • 90% of users with no recurring sequences
    • 6% of users had between one and five recurring sequences
    • 4% of users had more than five recurring sequences

  28. Fuzzy sequences
    • Allow different intermediate venues
    ... → Central Train Station → Starbucks? / Costa Coffee? / Caffe Nero? → School of ComSc → Uni Cafe → ...
    • Allow up to n intermediate venues when matching a pattern
    • Similar to regular expression matching...
    Pattern: C *{0,2} A
    ...has two matches in A-B-C-D-A-B-C-C-A-B-C
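
Because the slide compares fuzzy matching to regular expressions, the example pattern can be tried with Python's `re` module. This is a sketch under assumptions: each venue is encoded as a single character, the helper name `fuzzy_matches` is hypothetical, and matches are counted without overlap (which is how `findall` scans). Under those assumptions the pattern gives the two matches stated on the slide.

```python
import re


def fuzzy_matches(sequence, first, last, max_gap):
    """Find non-overlapping runs that start at `first`, end at `last`,
    and contain at most `max_gap` intermediate venues.

    sequence: string with one character per venue (assumed encoding).
    """
    pattern = re.compile(
        re.escape(first) + ".{0," + str(max_gap) + "}" + re.escape(last)
    )
    return pattern.findall(sequence)


if __name__ == "__main__":
    visits = "A-B-C-D-A-B-C-C-A-B-C".replace("-", "")
    print(fuzzy_matches(visits, "C", "A", 2))
```

Counting overlapping matches instead (every start position) would be a different design choice and can give a different total, so the counting rule should be fixed before comparing users.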

  29. Distinct fuzzy sequences per user
    [Figure: "Distribution of doubles"; x-axis: number of doubles (log scale, 10^0 to 10^4); y-axis: cumulative probability (log scale); curves: doubles2u, doubles2, doubles3, doubles5, doubles10]

  30. Summary
    • Users tend to make frequent and regular checkins to a limited number of venues
    • A small subset of users show repeated sequences of checkins
    • The type of venue affects regularity:
      • Home and Work venues are very regular; Outdoors venues are less regular
    • Movement of friends influences co-visit behaviour
    • City-specific characteristics affect user behaviours
      • Temporal behaviour is universal, but jump distance affected by geography?

  31. Ongoing and future
    research
    • Individual checkin patterns: regularity, predictability,
    heterogeneity
    • Influence of friendship on co-visiting behaviour --
    causality or commonality?
    • Relationship between personality traits and visiting
    behaviour

  32. Thanks for listening!
    Questions?
    Matt Williams
    www.mattjw.net
    [email protected]
    @voxmjw
    www.gplus.to/mattjw
    Supported by...
    { G.Colombo,
    M.J.Chorley,
    M.J.Williams,
    Stuart.M.Allen,
    R.M.Whitaker }
    @cs.cardiff.ac.uk
    www.recognition-project.eu
