TravelOAC: development of travel geodemographic classifications for England and Wales based on open data

nickbearman
April 15, 2015

Presented at GISRUK2015, University of Leeds, Wed 15th April 2015

Transcript

  1. Dr Nick Bearman, CGeog (GIS)
    Geographic Data Science Lab
    Department of Geography and Planning
    TravelOAC: development of travel geodemographic classifications for England and Wales based on open data
    Twitter: @nickbearmanuk
    "Cyclists at red 2" by [email protected] Commons (mail) - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Cyclists_at_red_2.jpg#/media/File:Cyclists_at_red_2.jpg
    epSos.de, https://www.flickr.com/photos/epsos/5591761716/

  2. Developing a geodemographic classification for travel
    • Travel
    • Geodemographics
      – Variables
      – Routing
    • Classification & Clusters
    • Findings
    • The Future

  3. Background
    • Travel is vital
    • Many different factors influence our choice of method of travel
    • Travel choice is important
      – CO2 emissions
      – Congestion
      – Cost / Time
      – Availability
      – Infrastructure development
    • This analysis is possible due to big data analysis and the 2011 Census
    "High Five Interchange". Licensed under CC BY 2.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:High_Five_Interchange.jpg#/media/File:High_Five_Interchange.jpg

  4. Geodemographics
    • Classification of people by where they live
    • 2001 OAC and 2011 OAC started the development of open geodemographics
    • Openness allows development of targeted geodemographics and applying geodemographics to custom data sets:
      – Internet (Riddlesden and Singleton, 2014)
      – Retail (Dolega and Singleton, 2014)
      – Consumer Data (many examples)
      – Transport
    Riddlesden and Singleton, 2014, “Broadband Speed Equity: A New Digital Divide?” Applied Geography 52 (August): 25–33. doi:10.1016/j.apgeog.2014.04.008.
    Dolega and Singleton, 2014, E-Resilience of British retail centres, http://geographicdatascience.com/talk/2014/12/18/regional-studies/

  5. Variable Selection
    Domains    | Concepts                           | Variable                                              | Census table used
    Demography | Gender                             | Gender                                                | KS101 Usual resident population
               | Age                                | Age groups                                            | KS102EW Age structure
               | Social Class                       | National Statistics socio-economic class              | KS102EW NS-SeC
    Transport  | Travel to work                     | Mode of usual travel to work                          | QS701EW Method of travel to work
               | Ease of access to car              | Car ownership                                         | KS404EW Car or van availability
               | Ease of access to public transport | Distance to closest bus/tram/train/ferry/airport stop | NA (distance calculated from NaPTAN data)

  6. Distance to closest transport stop
    • NaPTAN – National Public Transport Access Nodes database
    • Could use straight-line distance (as the crow flies)
    • But for bus, tram & rail this makes little sense
      – Walking routes are more representative of reality
      – (Not for airport or ferry)
    Walking route modeled at http://www.routino.org/ on 20150302, Router: Routino | Geo Data: © OpenStreetMap contributors | Tiles: © OpenStreetMap
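
The straight-line ("as the crow flies") option mentioned on slide 6 can be sketched as below. This is a minimal illustration, not the code used in the talk: the function names `haversine_m` and `nearest_stop` and the stop coordinates are hypothetical, and a great-circle (haversine) distance is assumed as the straight-line measure.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_stop(centroid, stops):
    """Return (distance_m, stop_id) for the stop closest to an OA centroid."""
    lat, lon = centroid
    return min(
        (haversine_m(lat, lon, s_lat, s_lon), stop_id)
        for stop_id, s_lat, s_lon in stops
    )


# Hypothetical coordinates for illustration only (not real NaPTAN records).
stops = [("stop_a", 53.8080, -1.5530), ("stop_b", 53.7950, -1.5480)]
distance_m, stop_id = nearest_stop((53.8067, -1.5550), stops)
```

A measure like this ignores the street network, which is why the slide prefers walking routes for bus, tram and rail stops.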

  7. • Street network - OpenStreetMap & Routino
      – Walk to nearest stop
    Input:
    • Origin - Destination - Mode
    Output: Text file
    • Distance
    • Route
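
The origin, destination and mode inputs on slide 7 could be passed to Routino's command-line `router` roughly as follows. This is a sketch under assumptions: the talk does not show how Routino was invoked, the helper name `routino_walk_command` and the `data_dir` and `prefix` values are hypothetical, and the option names are written as I understand them from the Routino documentation, so they should be checked against the installed version.

```python
def routino_walk_command(origin, destination, data_dir="routino-data", prefix="gb"):
    """Build an argument list for one walking route between two (lat, lon) points.

    data_dir and prefix are hypothetical placeholders for wherever the
    processed OpenStreetMap database lives.
    """
    lat1, lon1 = origin
    lat2, lon2 = destination
    return [
        "router",
        f"--dir={data_dir}",
        f"--prefix={prefix}",
        "--transport=foot",   # mode: walking
        "--shortest",         # minimise distance rather than time
        f"--lat1={lat1}",
        f"--lon1={lon1}",
        f"--lat2={lat2}",
        f"--lon2={lon2}",
        "--output-text",      # request the plain-text route description
    ]


# Hypothetical origin (OA centroid) and destination (stop) coordinates.
cmd = routino_walk_command((53.8067, -1.5550), (53.8080, -1.5530))
```

Where Routino is installed, a list like `cmd` could be handed to `subprocess.run(cmd, check=True)`, and the distance then read from the text output.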

  8. Routing Analysis
    • Run time for this data analysis: 24 hours
      – for 181,408 routes (each OA centroid)
      – 5 transport methods
    • Use of R to generate and manage data
    • Lots of big data opportunities
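
As a rough check on the figures from slide 8, the average time per route can be worked out as below. The slide does not say whether 181,408 is the number of routes per transport method or in total, so both readings are shown; the variable names are illustrative.

```python
run_time_s = 24 * 60 * 60   # 24 hours in seconds
n_oa = 181_408              # OA centroids, as stated on the slide
n_modes = 5                 # transport methods, as stated on the slide

# Reading 1: 181,408 routes in total.
seconds_per_route_total = run_time_s / n_oa

# Reading 2 (assumption): one route per OA for each of the 5 methods.
seconds_per_route_per_mode = run_time_s / (n_oa * n_modes)
```

Under either reading the average is well under one second per route.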

  9. Classification
    • Variables
    • Clustergram (Galili, 2010; Schonlau, 2002, 2004)
    • 8 clusters
    • K-means clustering
      – Classification
    Galili, A.T. (2010). Clustergram: visualization and diagnostics for cluster analysis (R code).
    Schonlau, M. (2002). The clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses. Stata J. 2, 391–402.
    Schonlau, M. (2004). Visualizing non-hierarchical and hierarchical cluster analyses with clustergrams. Comput. Stat. 19, 95–111.
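
The k-means step named on slide 9 can be illustrated with a small pure-Python version of Lloyd's algorithm. This is a sketch, not the analysis from the talk (which used R): the function name `kmeans`, its parameters and the toy data are hypothetical, and the toy example uses 2 clusters rather than the 8 chosen in the talk.

```python
import random


def kmeans(points, k, initial=None, iterations=100, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from supplied centroids, or from k randomly sampled points.
    centroids = list(initial) if initial else rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels


# Toy standardised data with two obvious groups (illustrative values only).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = kmeans(points, k=2, initial=[points[0], points[1]])
```

In practice each row would hold the standardised variables for one OA, and the clustergram would be used to judge how many clusters to request.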

  10. Clusters
    • 8 clusters
    • Distinctive spatial patterns and data patterns

  11. Clusters
    • Cartogram to show classification by OA

  12. Clusters
    * For distance, positive values are greater distances than average, and negative values are shorter distances than average.
    # NS-SeC classes:
    1. Higher managerial, administrative and professional occupations
    2. Lower managerial, administrative and professional occupations
    3. Intermediate occupations
    4. Small employers and own account workers
    5. Lower supervisory and technical occupations
    6. Semi-routine occupations
    7. Routine occupations
    8. Never worked and long-term unemployed
    Cartogram generated using ScapeToad, http://scapetoad.choros.ch
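
The distance note on slide 12 (positive means further than average, negative means closer) is consistent with values expressed relative to the mean. One common way to do this is a z-score, sketched below; the slides do not state which standardisation was used, so this is an assumption, and the function name `z_scores` and the sample distances are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical: nothing to scale
    return [(v - mu) / sigma for v in values]


# Hypothetical walking distances in metres for four OAs.
distances_m = [120.0, 300.0, 450.0, 1500.0]
scores = z_scores(distances_m)
```

Here the first OA gets a negative score (closer than average) and the last a positive one (further than average).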

  13-18. Clusters
    (Each of these slides repeats the distance note and NS-SeC key shown on slide 12.)

  19. Findings
    • Income (based on NS-SeC) is an important factor
    • As is gender (related to income)
    • Both are related to SES (socio-economic status), but there is very limited understanding of the mechanisms behind SES
    • Classification is speckly, so perhaps transport has limited impact on location?

  20. What the results are useful for
    • Understanding transport use and access
    • Do the two factors match?
    • Justification for development of new stations / services
    • The approach could be applied to more refined data (e.g. ticket sales, usage surveys, etc.), still using the routing element
    "KingsCrossDevelopmentModel". Licensed under CC BY-SA 2.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:KingsCrossDevelopmentModel.jpg#/media/File:KingsCrossDevelopmentModel.jpg

  21. Future developments
    • A transport-specific geodemographic could be developed
    • Extra processing power allows national analysis of transport & routing to be done
    • Routing allows a more accurate picture of accessibility to be generated

  22. Questions?
    Dr Nick Bearman, CGeog (GIS)
    Twitter: @nickbearmanuk
    Geographic Data Science Lab
    Department of Geography and Planning