Presentation to Google and ESRC for Google Data Analytics Social Science Research Call

nickbearman
January 14, 2015

Transcript

  1. Dr Nick Bearman, CGeog (GIS), AHEA
    Department of Geography and Planning
    CO2 Emissions and Home to School Travel
    @nickbearmanuk
    http://www.fotolog.com/luckyshot/41568423/
    https://twitter.com/Peter_Tennant/status/444404118330044416

  2. CO2 Emissions & Home to School Travel
    • Project overview
    • Findings
    • Challenges
    • Opportunities
    • Suggestions

  3. Home to School Travel
    • About 7.5m school-aged children in England
    • (Most) have to travel from home to school
    D Sharon Pruitt http://www.flickr.com/photos/pinksherbet/234942843/
    http://www.flickr.com/photos/bike/8560715649/in/photostream/
    http://en.wikipedia.org/wiki/File:Alpine_Travel_bus_DVG517_(YMB_517W)_1981_Bristol_VRT_SL3_ECW,_10_July_2006.jpg

  4. • Why is it important?
    – Active Transport
    – CO2 emissions / air pollution
    – Congestion
    Data sources
    • School Census
    – Pupil home postcode
    – “Usual” mode of travel (11 options)
    http://chestercycling.files.wordpress.com/2011/02/cimg2369.jpg
    http://static2.stuff.co.nz/1333093574/087/6670087.jpg

  5. Traditional estimation technique
    • Euclidean distance
    • Straight lines will typically underestimate true distances
    • Average emissions values for mode of transport
    • No sensitivity to different vehicle types
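
The traditional technique on this slide can be sketched as follows. This is a minimal illustration, not the project's code: the emission factors and the use of (easting, northing) coordinates in metres are assumptions made for the example, and the function names are hypothetical.

```python
import math

# Hypothetical flat emission factors in g CO2 per pupil-km (illustrative only,
# not taken from the slides). One value per mode, regardless of vehicle type.
EMISSION_G_PER_KM = {"walk": 0.0, "cycle": 0.0, "bus": 90.0, "car": 170.0}


def euclidean_km(home_en, school_en):
    """Straight-line distance in km between two (easting, northing) pairs in metres."""
    dx = school_en[0] - home_en[0]
    dy = school_en[1] - home_en[1]
    return math.hypot(dx, dy) / 1000.0


def straight_line_emissions_g(home_en, school_en, mode):
    """One-way estimate: straight-line distance times the flat factor for the mode."""
    return euclidean_km(home_en, school_en) * EMISSION_G_PER_KM[mode]


# Assumed coordinates: a journey 3 km east and 4 km north of home.
print(straight_line_emissions_g((0, 0), (3000, 4000), "car"))
```

Because a real route is never shorter than the straight line between its end points, an estimate of this kind is a lower bound on the network distance, which is the underestimation the slide refers to.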

  6. • Street network - OpenStreetMap & Routino
    • Home: L11 4SH  School: L11 0BP  Mode: WLK
    • Process was repeated for each school child
    • Processing time ~8.5 days in total

  7. Technical challenges
    • Large data sets
    • Multiple routing modes
    • Long run times
    • Confidential data
    – Home postcode
    Arne Hückelheim (author) http://en.wikipedia.org/wiki/File:SunsetTracksCrop.JPG

  8. • Google Compute Engine – need data to stay in EU, ideally UK – Data Protection & Dept. of Edu.
    • Can’t though - “No guarantee that your data at rest is kept only in that region” (then 15/03/2014)
    • 4.1 Data Storage. … may determine .. data .. stored permanently, at rest, in either the United States or the European Union….
    • 4.2 Transient Storage. … may be stored transiently or cached in any country in which Google or its agents maintain facilities (now 05/01/2015)
    https://cloud.google.com/terms/service-terms
    https://developers.google.com/compute/docs/faq#selectedcountries

  9. • Limit on Google Routing API
    – 2500 requests within 24 hours
    • Google, Bing and Routino routing solutions highly comparable
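
To see why the request limit matters at this scale, here is a rough back-of-the-envelope calculation. It assumes one routing request per pupil record and uses the 3.8m retained pupil records quoted in the extra slides of this deck; the function name is hypothetical.

```python
def days_needed(n_requests, daily_quota=2500):
    """Whole days required to issue n_requests under a fixed daily quota."""
    # Ceiling division using integers only, so no floating-point rounding.
    return -(-n_requests // daily_quota)


pupil_records = 3_800_000  # figure from the extra slides; one request each is an assumption
print(days_needed(pupil_records))  # 1520 days, i.e. over four years
```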

  10. Findings
    • Non geo models underestimate emissions for urban areas
    • Emissions increased with each academic year
    • Selective & religious schools have higher emissions
    • Can model altering the % of pupils who use active travel

  11. [Figure from the paper; values shown: 12032, 11966]
    Bearman, N. & Singleton, A.D. (2014) Modelling the potential impact on CO2 emissions of an increased uptake of active travel for the home to school commute using individual level data, Journal of Transport & Health, 1(4) p. 295–304. doi:10.1016/j.jth.2014.09.009, Open Access

  12. Challenges
    • Cloud processing – geographic location
    • Integration of open source and private / confidential data
    • Bus routes – Local Authority specific
    http://www.ft.com/cms/s/0/e2672ccc-349f-11e2-8986-00144feabdc0.html#axzz2xAmBRk6u

  13. Opportunities
    • Enables CO2 estimates and policy targeting at higher resolution
    • Methodology developed can be applied across many types of transport and many areas (inc. health access)
    • All code in public domain; an application to assist this analysis at LA level could be developed.

  14. Opportunities
    • Data from this project has been included in the Transport Map Book
    • For each Local Authority in England
    http://www.alex-singleton.com/r/2014/09/09/Transport-Map-Book/
    http://data.alex-singleton.com/transport/E09000007_Camden.pdf
    Camden
    325 LAs × 58 variables = 18,850 total

  15. Suggestions
    • Geographic restrictions on cloud storage and processing
    • Education of data suppliers – they are the gatekeepers
    • Reduce / remove academic usage limits
    • Promote tools through academic channels
    “We provide digital solutions for UK education and research”

  16. Questions?

  17. Extra Slides
    • Presented at GISRUK2014 at the University of Glasgow on 18th April 2014
    • http://www.nickbearman.me.uk/2014/04/modelling-home-to-school-travel-for-state-pupils-in-england-2008-2011/

  18. Dr Nick Bearman, CGeog (GIS)
    Department of Geography and Planning
    Modelling home to school travel for state pupils in England, 2008-2011
    @nickbearmanuk
    http://www.fotolog.com/luckyshot/41568423/
    https://twitter.com/Peter_Tennant/status/444404118330044416

  19. Overview
    • Home to school travel
    • Data, technical challenges & modelling routes
    • Primary to Secondary transition

  20. Home to School Travel
    • About 7.5m school-aged children in England
    • (Most) have to travel from home to school
    D Sharon Pruitt http://www.flickr.com/photos/pinksherbet/234942843/
    http://www.flickr.com/photos/bike/8560715649/in/photostream/
    http://en.wikipedia.org/wiki/File:Alpine_Travel_bus_DVG517_(YMB_517W)_1981_Bristol_VRT_SL3_ECW,_10_July_2006.jpg

  21. • What influences choice?
    – Distance
    – Pupil age
    – Road infrastructure
    – Family factors
    • Why is it important?
    – Active Transport
    – CO2 emissions / air pollution
    – Congestion
    http://chestercycling.files.wordpress.com/2011/02/cimg2369.jpg
    http://static2.stuff.co.nz/1333093574/087/6670087.jpg

  22. Data sources
    • Pupil Level Annual School Census (2008-2011)
    – Pupil home postcode
    – “Usual” mode of travel (11 options)
    – Data for each year
    – State schools only (Independent schools ~ 7%)
    • Edubase - school information
    • CO2 emissions by travel mode

  23. Traditional estimation technique
    • Euclidean distance
    • Average emissions values for mode of transport
    • Straight lines will typically underestimate true distances
    • No sensitivity to different vehicle types

  24. Technical challenges
    • Large data sets
    • Multiple routing modes
    • Long run times
    • Confidential data
    Arne Hückelheim (author) http://en.wikipedia.org/wiki/File:SunsetTracksCrop.JPG

  25. Routing: School census
    • 5m school children with records for 2008-2011
    • 3.8m (75.9%):
    – with complete mode data
    – not outliers
    • Outlier criteria
    – Tukey outlier (Tukey, 1977)
    – weighted by mode (m), year (a), local authority (g)
    – not too far from station
    – journey not too long
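
A minimal sketch of a Tukey-style outlier filter of the kind named on this slide. It assumes the standard Tukey fences (1.5 times the interquartile range beyond the quartiles) and interprets the slide's "weighted by mode, year, local authority" as computing the fences separately within each such group; both are assumptions, as are the record field names and the rule that very small groups are left unfiltered.

```python
from collections import defaultdict
from statistics import quantiles


def tukey_filter(records, k=1.5, min_group=4):
    """Drop records whose distance falls outside the Tukey fences of their group.

    records: iterable of dicts with (hypothetical) keys
             "mode", "year", "la" and "distance_km".
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["mode"], rec["year"], rec["la"])].append(rec)

    kept = []
    for recs in groups.values():
        if len(recs) < min_group:
            # Too few points for meaningful quartiles: keep as-is (assumption).
            kept.extend(recs)
            continue
        q1, _, q3 = quantiles([r["distance_km"] for r in recs],
                              n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        kept.extend(r for r in recs if low <= r["distance_km"] <= high)
    return kept


# Illustrative data: six walkers in one group, one with an implausible distance.
sample = [{"mode": "WLK", "year": 2008, "la": "Liverpool", "distance_km": d}
          for d in (1, 2, 3, 4, 5, 100)]
print([r["distance_km"] for r in tukey_filter(sample)])
```

Grouping first means that a distance which is ordinary for a train journey is not used to judge a walking journey, and vice versa.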

  26. Routing: Mode of travel
    • Home: L11 4SH  School: L11 0BP  Mode: WLK
    • Street network
    • Non-street network
    C. G. P. Grey http://en.wikipedia.org/wiki/File:Citadis_dublin.jpg

  27. • Street network - OpenStreetMap & Routino
    – Walk, Cycle, Car (+ Car Share), Taxi
    – Bus (public, school, unknown)
    Text file:
    • Distance
    • Route

  28. • Non-street network - pgRouting
    – Train, Tram (Metro / Light Rail), Tube

  29. Home: L11 4SH  School: L11 0BP  Mode: WLK
    • R then called Routino or pgRouting
    • Process was repeated for each school child
    • And for each year (2008-2011)
    – If either Mode, Start postcode or End postcode were different
    • Processing time ~8.5 days in total
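
The routing loop described on this slide might be sketched as below. The original workflow used R to call Routino or pgRouting; this Python version only illustrates the control flow. The two router functions are stand-ins supplied by the caller, and every mode code other than WLK, BUS and DSB (which appear in this deck) is a hypothetical placeholder. The cache here is shared across pupils as well as years, which goes slightly beyond what the slide states.

```python
# Mode codes: WLK, BUS and DSB appear in the deck; the others are hypothetical.
STREET_MODES = {"WLK", "CYC", "CAR", "TXI", "BUS", "DSB"}
RAIL_MODES = {"TRN", "TRM", "TUB"}


def route_all(records, street_router, rail_router):
    """Route each pupil-year record, re-routing only when the journey changes.

    records: iterable of dicts with keys "pupil", "year", "home", "school", "mode".
    street_router / rail_router: callables (home, school, mode) -> distance,
    standing in for calls to Routino and pgRouting respectively.
    """
    cache = {}
    results = []
    for rec in records:
        key = (rec["home"], rec["school"], rec["mode"])
        if key not in cache:
            if rec["mode"] in STREET_MODES:
                cache[key] = street_router(*key)
            elif rec["mode"] in RAIL_MODES:
                cache[key] = rail_router(*key)
            else:
                raise ValueError(f"unknown mode: {rec['mode']}")
        results.append((rec["pupil"], rec["year"], cache[key]))
    return results
```

A pupil whose mode, home postcode and school postcode are unchanged from one year to the next costs one routing call rather than four, which matters when the whole run takes days.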

  30. Why these programs?
    • Routino
    – Easy way of getting street-based routes
    • pgRouting
    – Routes for custom networks
    • R - Open Source
    – can handle big data (overnight running)
    • OS X & R
    – good command line interface
    – stable (?)

  31. Big Data and the Cloud
    • Run time an issue
    • But confidential data
    • Cloud solutions complex for permissions
    • “Where is the data stored?”
    • Keeping locally is a solution
    http://www.ft.com/cms/s/0/e2672ccc-349f-11e2-8986-00144feabdc0.html#axzz2xAmBRk6u

  32. Primary to Secondary mode choice
    • Big change primary to secondary
    – longer distances
    – Bus travel
    – limited average change in mode after this
    http://pixabay.com/en/boy-girl-hand-in-hand-kids-school-160168/

  33. [Figure: most common mode by distance]
    Primary: Walk to 1.25 km (0.75 miles), then Car (+ Bus)
    Secondary: Walk to 2.75 km (1.75 miles), then Bus (+ Car)
    Majority of BUS is DSB
    Majority of NON is WLK
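
The thresholds on this slide can be read as a simple rule of thumb for the most common mode at a given distance. The sketch below encodes only that reading; treating the threshold itself as still "walk" is an assumption, and the function and dictionary names are hypothetical.

```python
# Distance (km) up to which walking is the most common mode, per school phase.
WALK_LIMIT_KM = {"primary": 1.25, "secondary": 2.75}
# Most common mode beyond that distance, per school phase.
BEYOND_LIMIT_MODE = {"primary": "car", "secondary": "bus"}


def typical_mode(phase, distance_km):
    """Most common mode for a phase ("primary" or "secondary") at a distance."""
    if distance_km <= WALK_LIMIT_KM[phase]:
        return "walk"
    return BEYOND_LIMIT_MODE[phase]


print(typical_mode("primary", 2.0))    # car
print(typical_mode("secondary", 2.0))  # walk
```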

  34. CO2 Emissions
    • Have routes for all school traffic
    • Can calculate the CO2 impact of school traffic
    • Can model altering the % of pupils who use active travel
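
Once every journey has a routed distance, a scenario of the kind mentioned on this slide can be expressed as a small what-if calculation. This is a sketch under stated assumptions, not the published model: the emission factors are illustrative, the rule for choosing which pupils switch (shortest car journeys first, up to a distance cap) is invented for the example, and all names are hypothetical.

```python
def total_emissions_kg(journeys, factors_g_per_km):
    """Sum of distance times per-mode factor over all journeys, in kg CO2."""
    grams = sum(j["km"] * factors_g_per_km[j["mode"]] for j in journeys)
    return grams / 1000.0


def shift_to_active(journeys, share, max_km, from_mode="car", to_mode="walk"):
    """Return a copy of journeys with a share of short from_mode trips switched.

    Only journeys of at most max_km are eligible; the shortest are switched first.
    """
    eligible = sorted(
        (j for j in journeys if j["mode"] == from_mode and j["km"] <= max_km),
        key=lambda j: j["km"],
    )
    n_switch = round(len(eligible) * share)
    switched = {id(j) for j in eligible[:n_switch]}
    return [dict(j, mode=to_mode) if id(j) in switched else j for j in journeys]


# Illustrative factors (g CO2 per km) and journeys; none of these values are from the slides.
factors = {"car": 170.0, "walk": 0.0}
journeys = [
    {"mode": "car", "km": 1.0},
    {"mode": "car", "km": 2.0},
    {"mode": "car", "km": 5.0},
    {"mode": "walk", "km": 0.5},
]
baseline = total_emissions_kg(journeys, factors)
scenario = total_emissions_kg(shift_to_active(journeys, 0.5, 3.0), factors)
print(baseline, scenario)
```

Varying `share` and `max_km` then gives a family of scenarios that can be compared against the baseline.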

  35. Conclusion
    • Home to school travel is important
    • Can model individual routes nationally
    • Large processing can be done locally
    • Limitations
    • Future work