Slide 1

Slide 1 text

Hacking the Rail
Ingesting, analysing & visualising real-time streaming rail systems data
Charles Cai
21 April 2015
Data Science London Meetup
http://www.meetup.com/Data-Science-London/events/221885254/

Slide 2

Slide 2 text

About me
Hashtag bio:
#Intrapreneur #Innovator #Disruptor #DataScientist
#ETRM (Energy Trading & Risk Management) #IB #FO #MO #BO
#BigData #MachineLearning #Cloud #UI #UX
Twitter: @caidong
https://www.linkedin.com/in/charlescai
GitHub: charles-cai

Slide 3

Slide 3 text

The Hackathon (slide 1)
http://hacktrain.com

Slide 4

Slide 4 text

The Hackathon (slide 2)

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

HackTrain – our challenges for you

Incident duration estimates
• Currently based on ‘best guess’ and experience
• Seeking smart solution
• Using historical data
• Accuracy needs to improve
• Incident ‘types’ are the challenge

Journey check
• Currently based on ‘best guess’ and experience
• Seeking smart solution
• Using historical data
• Accuracy needs to improve
• Incident ‘types’ are the challenge
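One simple baseline for the incident duration challenge is to replace the 'best guess' with the median historical duration for each incident type. The sketch below is only an illustration under stated assumptions: the incident types, the durations and the function names are hypothetical, and it is not the approach any HackTrain team is known to have used.

```python
from collections import defaultdict
from statistics import median


def median_duration_by_type(history):
    """Return {incident_type: median duration in minutes} from (type, minutes) pairs."""
    durations = defaultdict(list)
    for incident_type, minutes in history:
        durations[incident_type].append(minutes)
    return {t: median(values) for t, values in durations.items()}


def estimate_duration(incident_type, medians, fallback=None):
    """Estimate a new incident's duration; use `fallback` for unseen types."""
    return medians.get(incident_type, fallback)


# Hypothetical historical records: (incident type, duration in minutes).
history = [
    ("signal failure", 45), ("signal failure", 60), ("signal failure", 90),
    ("points failure", 30), ("points failure", 50),
]
medians = median_duration_by_type(history)
print(estimate_duration("signal failure", medians))
```

The median is used rather than the mean so that a few very long incidents do not dominate the estimate. A real solution would need richer features than the incident type alone.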

Slide 7

Slide 7 text

National Public Transport Access Nodes (NaPTAN)

NaPTAN is Britain's national system for uniquely identifying points of access to public transport.

It is a core component of the national transport information infrastructure and is used by a number of other UK standards and information systems.

There is a NaPTAN record for every bus stop, railway station, airport, ferry terminal etc. in England, Scotland and Wales. Record attributes include co-ordinates (OSGR and Lat-Long), NPTG locality reference, name components and SMS code.

200MB CSV / 600MB XML!

https://www.gov.uk/government/publications/national-public-transport-access-node-schema
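Because the CSV export is large, it is worth streaming it row by row instead of loading it whole. The sketch below filters railway stations out of a tiny inline sample. The column names (ATCOCode, CommonName, Longitude, Latitude, StopType), the "RLY" stop type value and all sample rows are assumptions for illustration; check them against the actual NaPTAN download before relying on them.

```python
import csv
import io

# Made-up sample rows; column names and the "RLY" code are assumptions.
SAMPLE_CSV = """ATCOCode,CommonName,Longitude,Latitude,StopType
9100EXAMPL1,Example Rail Station,-0.1000,51.5000,RLY
490000000X,Example Bus Stop,-0.1100,51.5100,BCT
9100EXAMPL2,Another Rail Station,-2.2000,53.4000,RLY
"""


def rail_stations(csv_file, stop_type="RLY"):
    """Yield (name, latitude, longitude) for rows whose StopType matches."""
    for row in csv.DictReader(csv_file):
        if row["StopType"] == stop_type:
            yield row["CommonName"], float(row["Latitude"]), float(row["Longitude"])


# With the real file, pass open(path, newline="") instead of StringIO.
stations = list(rail_stations(io.StringIO(SAMPLE_CSV)))
for name, lat, lon in stations:
    print(name, lat, lon)
```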

Slide 8

Slide 8 text

• About us
– We run, maintain and develop Britain’s rail tracks, signalling, bridges, tunnels, level crossings, viaducts and 18 key sta

Slide 9

Slide 9 text

Network Rail and Open Data

Name | Description | Frequency
BPLAN | Train planning data, including locations and sectional running times. | Twice a year
Corpus | Location reference data. | Monthly
Movement | Train positioning and movement event data. | Real-time
RTPPM (real time public performance measure) | Performance of trains against the timetable, measured as the percentage of trains arriving at their destination on-time. | One message / minute
SMART | Train describer berth offset data used for train reporting. | Monthly
TD | Train positioning data at signalling berth level. | Real-time
TSR (Temporary speed restrictions) | Details of temporary reductions in permissible speed across the rail network. | Once a week / Friday morning
VSTP (Very short term plan) | Train schedules created via the very short term plan process which are not available via the Schedule feed. | Real-time
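The RTPPM definition above (percentage of trains arriving at their destination on-time) reduces to a short calculation. The sketch below assumes a list of lateness values in minutes at destination and treats "on-time" as less than 5 minutes late. That threshold, the sample data and the function name are assumptions for illustration; the real measure also has rules for cancellations that are ignored here.

```python
def ppm(lateness_minutes, threshold=5):
    """Percentage of trains arriving less than `threshold` minutes late.

    `threshold` is an assumed definition of on-time, not taken from the feed.
    Returns None when there are no arrivals to measure.
    """
    if not lateness_minutes:
        return None
    on_time = sum(1 for minutes in lateness_minutes if minutes < threshold)
    return 100.0 * on_time / len(lateness_minutes)


# Hypothetical lateness at destination, in minutes, for eight trains.
arrivals = [0, 2, 4, 7, 12, 1, 0, 3]
score = ppm(arrivals)
print(score)
```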

Slide 10

Slide 10 text

ATOC brings together the 23 train companies that serve the length and breadth of the UK, to preserve and enhance the benefits for passengers of Britain’s national rail network.

Slide 11

Slide 11 text

Name | Description | Frequency
Timetable Feed | Full Timetable File: details of all national rail passenger train services, CIF format; Manual Train File: (Z Trains File); Master Stations Names File: all location specific data relevant to FTF; Fixed Link File / Set File / Report File: … | Weekly
Fares Feed | Train fares, including promotional fares, corrections, under strict rules by government | January, May and September
London Terminals Feed | Valid London railway station for any fare advertised with a destination of ‘London Terminals’ | -

Avantix Fares Application
http://data.atoc.org

Slide 12

Slide 12 text

RTTI – Real Time Train Information
a.k.a. Darwin

Slide 13

Slide 13 text

• Darwin – a complex application – taking data from a wide range of industry sources
• Uses predictive and heuristic technology to convert data into useful predictions of train running
• Scheduled timetable and movement data by train company and National Rail Communication Centre
• Taking GPS data directly from trains with Wi-Fi + trains with GPS locators

• Darwin CIS – Customer Information Systems (completed by April 2015)
• Real time display throughout UK
• NRE App
• Nationalrail.co.uk
• NRE telephone / Mobile channels

*HackTrain: a full copy of March 2015 SQL RTTI Database Dump
- 9 Million messages
- Half million forecast messages with 7 reasons

http://www.nationalrail.co.uk/46391.aspx

Slide 14

Slide 14 text

Identifying Locations:
STANOX – Station Number
TIPLOC – Timing Point Location
NLC – National Location Code
3-Alpha Code for CRS – Computer Reservation System or NRS – National Reservation System

• Knowledgebase & XML Feeds
• Incident XML (Service Disruption)
• Incident XML (Engineering work)
• National Service Indicator (NSI)
• Stations, Promotions, Ticket Types, TOCs
• Darwin Push-Port
• Continuously streaming of train schedule + train running predictions
• Area of interest / Entire country
• Extremely high-volume
• Darwin Timetables
• Schedule changes (delta)

http://nrodwiki.rockshore.net/index.php/Main_Page
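Different feeds identify the same place with different codes, so joining them usually means translating between STANOX, TIPLOC, NLC and the 3-alpha CRS code. The sketch below builds one lookup table per code type from location reference records. The record layout, the field names and the sample values are hypothetical and chosen for illustration; real reference data (for example the Corpus feed) will need its own field mapping.

```python
# Hypothetical reference records; field names and values are assumptions.
records = [
    {"stanox": "00001", "tiploc": "EXAMPLA", "nlc": "100000", "crs": "EXA",
     "name": "Example A"},
    {"stanox": "00002", "tiploc": "EXAMPLB", "nlc": "100100", "crs": "",
     "name": "Example B Junction"},  # not every location has a CRS code
]


def build_indexes(records, keys=("stanox", "tiploc", "nlc", "crs")):
    """Return {code_type: {code_value: record}}, skipping blank codes."""
    indexes = {key: {} for key in keys}
    for rec in records:
        for key in keys:
            value = rec.get(key, "").strip()
            if value:
                indexes[key][value] = rec
    return indexes


def translate(indexes, from_code, value, to_code):
    """Translate a code between schemes; None if unknown or not assigned."""
    rec = indexes[from_code].get(value)
    if rec is None:
        return None
    return rec.get(to_code) or None


indexes = build_indexes(records)
print(translate(indexes, "stanox", "00001", "crs"))
```

Blank codes are skipped when indexing so that locations without a CRS code do not collide under an empty key.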

Slide 15

Slide 15 text

https://github.com/openraildata

Peter Hicks from NRE manages the GitHub repo. Peter also manages http://www.openraildata.info

I’ll upload my Darwin code to stomp-client-python soon.

Please contact me via Twitter / GitHub or LinkedIn if you are interested in Visualization, Predictions, and Mobile Apps using rail data as well as other data sets (e.g. MET Weather, Twitter and Facebook etc)!

Slide 16

Slide 16 text

Appendix: our team’s project in HackTrain
• And you can find more information on the winning teams’ work here: http://hacktrain.com

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

No content