Public Transport Route Planning over Lightweight Linked Data Interfaces

Research presentation on how to publish transport data for maximum reuse.

Presented at ICWE2017

Pieter Colpaert

June 08, 2017

Transcript

  1. pietercolpaert.be linkedconnections.org Public Transit Route Planning over Lightweight Linked Data Interfaces

    Pieter Colpaert, Ruben Verborgh, Erik Mannens
  2. What data would you need?

  3. Content

    1. State of the art in sharing data on the Web for route planners
    2. Proposal: let’s fragment our dataset instead
    3. Evaluation design: replaying real query logs on different set-ups
    4. Results: user-perceived performance and cost-efficiency
    5. Conclusion: new valuable trade-off established
  4. Option 1: publishing a GTFS data dump

    A zip file containing CSV files. Cheap publishing solution. Straightforward to keep online. High effort required from user agents: they need to import your data over and over again.
  5. Option 2: a web service, e.g., a plain old HTTP+JSON API

    http://api.{mycompany}/?from={A}&to={B}&departuretime=2017-06-08T14:45.024Z&wheelchairaccessible=true&transit_modes=plane,railway,bus,car&algoritm_mode=shortest ...
  6. pietercolpaert.be linkedconnections.org

  7. Let’s Web-engineer route planning!

    REST for high user-perceived performance, caching and cost-efficiency. Hypermedia for enabling intelligent agents. Linked Data for semantic interoperability.
  8. Can we decouple data publishing from the execution of the algorithm?
  9. Let’s take a look at the data

    A connection: departureTime + departureStop, arrivalTime + arrivalStop. Another connection: departureTime + departureStop, arrivalTime + arrivalStop.
  10. Connection Scan Algorithm

    ~ creating a minimum spanning tree through a sorted directed acyclic graph. Squares are connections, ordered along a time axis.
  11. Resource 1, Resource 2, ..., Resource X

    Resources are ordered by time and linked through hydra:next. X requests needed instead of just 1.
  12. Let your browser find a route for you

    LinkedConnections.org
  13. Evaluation

    Is this new trade-off more cost-efficient for the data publisher? How much slower is it for the data reuser?
  14. Three set-ups

    1. A query server
    2. Linked Connections with always unique user agents
    3. Linked Connections with only one user agent over the entire Web
    Real cost-efficiency of the Linked Connections interface will be found in between 2 and 3.
  15. Three set-ups

    1. A query server: https://api.{myapp}/?from={A}&to={B}
    2. Linked Connections server + client: https://{myhost}/{datafragmentid}
    3. Linked Connections server + client with cache
    Real cost-efficiency of the Linked Connections interface will be found in between 2 and 3. Diagram labels: MongoDB with connections, server, client, cache.
  16. Open research data

    1. Real schedules, source: https://gtfs.irail.be/
    2. Real query logs, source: https://api.irail.be/logs
    https://github.com/linkedconnections/benchmark-belgianrail
  17. Results

    1. CPU time on the server
    2. Average time spent by the client per connection = an indication of the user-perceived performance
  18. CPU time on the server

    Linked Connections definitely has a more lightweight interface. Real world between these 2 values.
  19. Average time spent by the client per connection

    Under low load, Linked Connections is slightly slower, yet under high load, Linked Connections gives better response times. Real world between these 2 values.
  20. Non-measured benefits

    User profile only on your smartphone → privacy by design. Combining it with other datasets becomes easy. Route planning becomes merely adding a library to your software project. “Happiest route” by @danielequercia.
  21. Federated route planning becomes straightforward

  22. “Transfers” are now a semantic interoperability problem

    A problem we can solve with Linked Data. Connection A: departureTime T1, departureStop S1, arrivalTime T2, arrivalStop S2. Rail Station S2: longitude X1, latitude Y1, name ..., ParentStation Station S3. As S2 becomes reachable, other stops become reachable as well: nearby Bus Stop S4, Bus Stop S5, ... (diagram edge label: has parent stop).
  23. Conclusion

    New trade-off established for cost-efficiently maximizing possible reuse of public transport data. Spectrum from data publishing to data services: data dumps; Linked Connections (http://{myhost}/{datafragmentid}); route planning algorithms as a service (http://api.{myapp}/?from={A}&to={B}); answer any question on the server. Average cache hit-rate of 78%.
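
The idea from slides 9 to 11 (connections sorted by departure time, published as pages linked through hydra:next, scanned on the client) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Connection`, `Page`, `iter_connections`, `earliest_arrival`, the in-memory `PAGES` table and the `page/N` identifiers are all assumptions made for illustration; times are plain minutes since midnight, and a real client would fetch and parse the pages over HTTP instead of reading a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Connection:
    departure_stop: str
    departure_time: int  # minutes since midnight (illustrative unit)
    arrival_stop: str
    arrival_time: int


@dataclass
class Page:
    connections: List[Connection]  # sorted by departure_time
    next_url: Optional[str]        # plays the role of hydra:next


def iter_connections(fetch_page: Callable[[str], Page],
                     start_url: str) -> Iterator[Connection]:
    """Yield connections page by page, following the next links lazily."""
    url: Optional[str] = start_url
    while url is not None:
        page = fetch_page(url)
        yield from page.connections
        url = page.next_url


def earliest_arrival(connections: Iterable[Connection], origin: str,
                     destination: str, departure_time: int) -> Optional[int]:
    """Single scan over time-sorted connections (earliest-arrival variant)."""
    earliest: Dict[str, int] = {origin: departure_time}
    for c in connections:
        best = earliest.get(destination)
        if best is not None and c.departure_time >= best:
            break  # later departures cannot improve the result
        reachable_at = earliest.get(c.departure_stop)
        if reachable_at is not None and reachable_at <= c.departure_time:
            if c.arrival_time < earliest.get(c.arrival_stop, float("inf")):
                earliest[c.arrival_stop] = c.arrival_time
    return earliest.get(destination)


# Hypothetical in-memory stand-in for the published data fragments.
PAGES: Dict[str, Page] = {
    "page/1": Page([Connection("A", 480, "B", 500),
                    Connection("C", 490, "D", 520)], "page/2"),
    "page/2": Page([Connection("B", 505, "D", 530),
                    Connection("B", 510, "C", 525)], "page/3"),
    "page/3": Page([Connection("D", 540, "E", 560)], None),
}


def fetch_demo_page(url: str) -> Page:
    """Stand-in for an HTTP GET on a fragment URL."""
    return PAGES[url]
```

With this demo data, `earliest_arrival(iter_connections(fetch_demo_page, "page/1"), "A", "D", 480)` returns 530. Because the generator only requests a page when the scan reaches it, the scan stops requesting fragments once the destination can no longer be improved, and every request is a plain GET on a fixed URL that an HTTP cache can serve.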