Presentation to Google and ESRC for Google Data Analytics Social Science Research Call

nickbearman
January 14, 2015

Transcript

  1. Dr Nick Bearman, CGeog (GIS), AHEA
     Department of Geography and Planning
     CO2 Emissions and Home to School Travel
     @nickbearmanuk
     http://www.fotolog.com/luckyshot/41568423/
     https://twitter.com/Peter_Tennant/status/444404118330044416
  2. CO2 Emissions & Home to School Travel
     • Project overview
     • Findings
     • Challenges
     • Opportunities
     • Suggestions
  3. Home to School Travel
     • About 7.5m school-aged children in England
     • (Most) have to travel from home to school
     D Sharon Pruitt http://www.flickr.com/photos/pinksherbet/234942843/
     http://www.flickr.com/photos/bike/8560715649/in/photostream/
     http://en.wikipedia.org/wiki/File:Alpine_Travel_bus_DVG517_(YMB_517W)_1981_Bristol_VRT_SL3_ECW,_10_July_2006.jpg
  4. • Why is it important?
       – Active Transport
       – CO2 emissions / air pollution
       – Congestion
     Data sources
     • School Census
       – Pupil home postcode
       – “Usual” mode of travel (11 options)
     http://chestercycling.files.wordpress.com/2011/02/cimg2369.jpg
     http://static2.stuff.co.nz/1333093574/087/6670087.jpg
  5. Traditional estimation technique
     • Euclidean distance
     • Straight lines will typically underestimate true distances
     • Average emissions values for mode of transport
     • No sensitivity to different vehicle types
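A minimal sketch of the traditional technique on this slide, assuming home and school postcodes have already been geocoded to latitude/longitude. The great-circle (haversine) distance stands in for the straight-line distance; the emission factors and function names are illustrative placeholders, not values or code from the project.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Hypothetical average emission factors (g CO2 per km); placeholders only.
EMISSION_FACTORS_G_PER_KM = {"walk": 0.0, "cycle": 0.0, "car": 150.0, "bus": 80.0}


def straight_line_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def journey_emissions_g(distance_km, mode):
    """Distance multiplied by one average factor per mode (no vehicle-type detail)."""
    return distance_km * EMISSION_FACTORS_G_PER_KM[mode]
```

Because a straight line is never longer than the route actually travelled, any estimate built this way is a lower bound on the network-based figure.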
  6. • Street network - OpenStreetMap & Routino
     • Home: L11 4SH  School: L11 0BP  Mode: WLK
     • Process was repeated for each school child
     • Processing time ~8.5 days in total
  7. Technical challenges
     • Large data sets
     • Multiple routing modes
     • Long run times
     • Confidential data
       – Home postcode
     Arne Hückelheim (author) http://en.wikipedia.org/wiki/File:SunsetTracksCrop.JPG
  8. • Google Compute Engine – need data to stay in EU, ideally UK – Data Protection & Dept. of Edu.
     • Can’t though - “No guarantee that your data at rest is kept only in that region” (then 15/03/2014)
     • 4.1 Data Storage. … may determine .. data .. stored permanently, at rest, in either the United States or the European Union….
     • 4.2 Transient Storage. … may be stored transiently or cached in any country in which Google or its agents maintain facilities (now 05/01/2015)
     https://cloud.google.com/terms/service-terms
     https://developers.google.com/compute/docs/faq#selectedcountries
  9. • Limit on Google Routing API
       – 2500 within 24 hours
     • Google, Bing and Routino routing solutions highly comparable
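A rough sketch of what the quoted limit implies for a job of this size. The limit of 2500 requests per 24 hours is taken from the slide; the function names and the one-request-per-route assumption are illustrative only.

```python
import math

DAILY_LIMIT = 2500  # requests per 24 hours, as quoted on the slide


def days_needed(n_routes, daily_limit=DAILY_LIMIT):
    """Whole days required if each route costs one request."""
    return math.ceil(n_routes / daily_limit)


def daily_batches(items, daily_limit=DAILY_LIMIT):
    """Split a list of route requests into chunks that fit the daily limit."""
    for start in range(0, len(items), daily_limit):
        yield items[start:start + daily_limit]
```

For the roughly 3.8m pupil records mentioned later in the deck, `days_needed(3_800_000)` gives 1520 days, which is why a locally run router was the practical option.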
  10. Findings
     • Non geo models underestimate emissions for urban areas
     • Emissions increased with each academic year
     • Selective & religious schools have higher emissions
     • Can model altering the % of pupils who use active travel
  11. [Figure; values shown: 12032, 11966]
     Bearman, N. & Singleton, A.D. (2014) Modelling the potential impact on CO2 emissions of an increased uptake of active travel for the home to school commute using individual level data, Journal of Transport & Health, 1(4) p. 295–304. doi:10.1016/j.jth.2014.09.009, Open Access
  12. Challenges
     • Cloud processing – geographic location
     • Integration of open source and private / confidential data
     • Bus routes – Local Authority specific
     http://www.ft.com/cms/s/0/e2672ccc-349f-11e2-8986-00144feabdc0.html#axzz2xAmBRk6u
  13. Opportunities
     • Enables CO2 estimates and policy targeting at higher resolution
     • Methodology developed can be applied across many types of transport and many areas (inc. health access)
     • All code in public domain, application to assist this analysis at LA level could be developed.
  14. Opportunities
     • Data from this project has been included in the Transport Map Book
     • For each Local Authority in England
     http://www.alex-singleton.com/r/2014/09/09/Transport-Map-Book/
     http://data.alex-singleton.com/transport/E09000007_Camden.pdf
     Camden
     325 LAs, 58 variables, 18,850 total
  15. Suggestions
     • Geographic restrictions on cloud storage and processing
     • Education of data suppliers – they are the gatekeepers
     • Reduce / remove academic usage limits
     • Promote tools through academic channels
     “We provide digital solutions for UK education and research”
  16. Questions?

  17. Extra Slides
     • Presented at GISRUK2014 at the University of Glasgow on 18th April 2014
     • http://www.nickbearman.me.uk/2014/04/modelling-home-to-school-travel-for-state-pupils-in-england-2008-2011/
  18. Dr Nick Bearman, CGeog (GIS)
     Department of Geography and Planning
     Modelling home to school travel for state pupils in England, 2008-2011
     @nickbearmanuk
     http://www.fotolog.com/luckyshot/41568423/
     https://twitter.com/Peter_Tennant/status/444404118330044416
  19. Overview
     • Home to school travel
     • Data, technical challenges & modeling routes
     • Primary to Secondary transition
  20. Home to School Travel
     • About 7.5m school-aged children in England
     • (Most) have to travel from home to school
     D Sharon Pruitt http://www.flickr.com/photos/pinksherbet/234942843/
     http://www.flickr.com/photos/bike/8560715649/in/photostream/
     http://en.wikipedia.org/wiki/File:Alpine_Travel_bus_DVG517_(YMB_517W)_1981_Bristol_VRT_SL3_ECW,_10_July_2006.jpg
  21. • What influences choice?
       – Distance
       – Pupil age
       – Road infrastructure
       – Family factors
     • Why is it important?
       – Active Transport
       – CO2 emissions / air pollution
       – Congestion
     http://chestercycling.files.wordpress.com/2011/02/cimg2369.jpg
     http://static2.stuff.co.nz/1333093574/087/6670087.jpg
  22. Data sources
     • Pupil Level Annual School Census (2008-2011)
       – Pupil home postcode
       – “Usual” mode of travel (11 options)
       – Data for each year
       – State schools only (Independent schools ~ 7%)
     • Edubase - school information
     • CO2 emissions by travel mode
  23. Traditional estimation technique
     • Euclidean distance
     • Average emissions values for mode of transport
     • Straight lines will typically underestimate true distances
     • No sensitivity to different vehicle types
  24. Technical challenges
     • Large data sets
     • Multiple routing modes
     • Long run times
     • Confidential data
     Arne Hückelheim (author) http://en.wikipedia.org/wiki/File:SunsetTracksCrop.JPG
  25. Routing: School census
     • 5m school children with records for 2008-2011
     3.8m (75.9%)
     • with complete mode data
     • Not outliers
       – Tukey outlier (Tukey, 1977)
       – weighted by mode (m), year (a), local authority (g)
       – not too far from station
       – journey not too long
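One way the Tukey outlier step could look, as a sketch only. It assumes the standard Tukey fences (quartiles plus or minus 1.5 times the interquartile range) applied separately within each mode, year and local authority group; the slide's "weighted by" may mean something different, and the field names below are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(journeys, k=1.5):
    """Keep journeys whose distance lies inside the fences of their group.

    journeys: list of dicts with hypothetical keys
    "mode", "year", "la" and "distance_km".
    """
    groups = defaultdict(list)
    for journey in journeys:
        groups[(journey["mode"], journey["year"], journey["la"])].append(journey)

    kept = []
    for members in groups.values():
        if len(members) < 4:  # too few points for meaningful quartiles
            kept.extend(members)
            continue
        low, high = tukey_fences([m["distance_km"] for m in members], k)
        kept.extend(m for m in members if low <= m["distance_km"] <= high)
    return kept
```

Grouping first means a long bus journey in a rural authority is judged against other bus journeys there, not against short urban walks.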
  26. Routing: Mode of travel
     • Home: L11 4SH  School: L11 0BP  Mode: WLK
     • Street network
     • Non-street network
     C. G. P. Grey http://en.wikipedia.org/wiki/File:Citadis_dublin.jpg
  27. • Street network - OpenStreetMap & Routino
       – Walk, Cycle, Car (+ Car Share), Taxi
       – Bus (public, school, unknown)
     Text file:
     • Distance
     • Route
  28. • Non-street network - pgRouting
       – Train, Tram (Metro / Light Rail), Tube
     pgRouting
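The split between street and non-street modes on this slide and the previous one amounts to a lookup from travel mode to routing engine. A minimal sketch, with mode labels invented for illustration (the School Census uses its own codes, such as WLK):

```python
# Hypothetical mode labels; the real census codes differ.
STREET_MODES = {"walk", "cycle", "car", "car_share", "taxi", "bus"}
NON_STREET_MODES = {"train", "tram", "tube"}


def routing_engine(mode):
    """Name the engine that would route a journey made by this mode."""
    if mode in STREET_MODES:
        return "routino"      # OpenStreetMap street network
    if mode in NON_STREET_MODES:
        return "pgrouting"    # custom rail / tram / tube networks
    raise ValueError(f"no routing engine configured for mode {mode!r}")
```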
  29. Home: L11 4SH  School: L11 0BP  Mode: WLK
     • R then called Routino or pgRouting
     • Process was repeated for each school child
     • And for each year (2008-2011)
       – If either Mode, Start postcode or End postcode were different
     • Processing time ~8.5 days in total
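The rule above (re-route a pupil in a later year only if the mode, start postcode or end postcode changed) can be sketched as follows. The project drove the routers from R; this Python version and its record layout are assumptions for illustration.

```python
def routes_to_compute(records):
    """Return the records that need a fresh route calculation.

    records: iterable of dicts with hypothetical keys
    "pupil_id", "year", "home", "school" and "mode".
    A pupil's record is skipped when home, school and mode
    all match that pupil's previous year.
    """
    last_trip = {}
    todo = []
    for rec in sorted(records, key=lambda r: (r["pupil_id"], r["year"])):
        trip = (rec["home"], rec["school"], rec["mode"])
        if last_trip.get(rec["pupil_id"]) != trip:
            todo.append(rec)
            last_trip[rec["pupil_id"]] = trip
    return todo
```

With run times measured in days, skipping unchanged journeys is a cheap way to cut the number of router calls.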
  30. Why these programs?
     • Routino
       – Easy way of getting street based routes
     • pgRouting
       – Routes for custom networks
     • R - Open Source
       – can handle big data (o/n running)
     • OS X & R
       – good command line interface
       – stable (?)
  31. Big Data and the Cloud
     • Run time an issue
     • But confidential data
     • Cloud solutions complex for permissions
     • “Where is the data stored?”
     • Keeping locally is a solution
     http://www.ft.com/cms/s/0/e2672ccc-349f-11e2-8986-00144feabdc0.html#axzz2xAmBRk6u
  32. Primary to Secondary mode choice
     • Big change primary to secondary
       – longer distances
       – Bus travel
       – limited average change in mode after this
     http://pixabay.com/en/boy-girl-hand-in-hand-kids-school-160168/
  33. None
  34. None
  35. [Figure: mode of travel by distance]
     Maj. of BUS is DSB
     Maj. of NON is WLK
     Secondary: Walk to 2.75 km (1.75 miles), then Bus (+ Car)
     Primary: Walk to 1.25 km (0.75 miles), then Car (+ Bus)
  36. CO2 Emissions
     • Have routes for all school traffic
     • Can calculate the CO2 impact of school traffic
     • Can model altering the % of pupils who use active travel
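A minimal sketch of the kind of scenario modelling the last bullet describes, assuming each routed journey has a network distance and a mode. The emission factors, the 2 km walking threshold and the shortest-first switching rule are all hypothetical choices, not the method used in the paper.

```python
# Hypothetical emission factors (g CO2 per km); placeholders only.
FACTORS_G_PER_KM = {"walk": 0.0, "cycle": 0.0, "car": 150.0, "bus": 80.0}


def total_emissions_g(journeys, factors=FACTORS_G_PER_KM):
    """Sum of distance times per-mode factor over all journeys."""
    return sum(j["distance_km"] * factors[j["mode"]] for j in journeys)


def shift_to_active(journeys, share, max_km=2.0, from_mode="car", to_mode="walk"):
    """Switch a share of short car journeys to walking, shortest first.

    Returns a new list; the input journeys are not modified.
    """
    candidates = sorted(
        (j for j in journeys if j["mode"] == from_mode and j["distance_km"] <= max_km),
        key=lambda j: j["distance_km"],
    )
    n_switch = int(len(candidates) * share)
    switched = {id(j) for j in candidates[:n_switch]}
    return [dict(j, mode=to_mode) if id(j) in switched else j for j in journeys]
```

Comparing `total_emissions_g(journeys)` with `total_emissions_g(shift_to_active(journeys, 0.5))` gives the estimated saving for a 50% uptake among eligible journeys.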
  37. Conclusion
     • Home to school travel is important
     • Can model individual routes nationally
     • Large processing can be done locally
     • Limitations
     • Future work