Data for maximum reuse at Solvay Brussels School

Open Data is only a legal definition. The goal behind Open Data, however, is to maximize reuse. In this talk I explain the theory of maximizing the interoperability between open datasets and hint at possible business models built on Open Data today.

Pieter Colpaert

March 17, 2017

Transcript

  1. Data for maximum reuse (@pietercolpaert). Trying to maximise the reuse of your datasets. Reusing open data to enrich your own business model.
  2. Open Data in the world. For example, the data portal from the World Bank: http://data.worldbank.org
  3. Status of e.g., Public Transit in BE?

               SNCB     STIB    De Lijn  TEC
    Schedules  shared   shared  shared   open
    Real-time  planned  shared  shared   planned
    Tickets    no       no      yes      no
    Historic   no       no      no       open
  4. Sharing data between 2 systems: your system and a third-party system. Agree on a protocol. The protocol will determine which questions can be answered in a timely fashion. The third party can ask questions to your system as previously agreed.
  5. Sharing data on the Web. Your system; ? ? ? ? ? ? Maximizing reuse → need to raise the interoperability.
  6. ↓ Querying, syntactic, semantic, technical, legal. When I have got 2 datasets, how easy is it to use them as if they were 1?
  7. As a reuser, you need certainty that you won’t get sued. https://github.com/iRail/stations
  8. A story of raising interoperability. ↓ Querying, syntactic, semantic, technical, legal. When I have 2 datasets, how easy is it to turn them into 1 dataset?

    → Open Definition & open licenses
    → The Internet: exchanging data world-wide
    → JSON, XML, CSV, HTML… Open Standards
  9. Serialisations

    Table / CSV / Spreadsheet:

    name  type     same as  location
    imec  company  iMinds   X

    JSON:

    { "imec" : { "type" : "company", "same as" : "iMinds", "location" : "X" } }

    XML:

    <imec> <type>company</type> <sameas>iMinds</sameas> <location> X </location> </imec>
  10. Triple structure

    Table / CSV / Spreadsheet:

    name  type     same as  location
    imec  company  iMinds   X

    Triples:

    <imec> <type> <company> .
    <imec> <sameas> <iMinds> .
    <imec> <location> "X" .

    XML:

    <imec> <type>company</type> <sameas>iMinds</sameas> <location> X </location> </imec>

    JSON:

    { "imec" : { "type" : "company", "same as" : "iMinds", "location" : "X" } }
  11. Linked data: solving semantic interoperability? World Wide Web: Machine 1, Machine 2, Machine 3. Statements: imec same as iMinds; imec is a company; iMinds located at X.
  12. A story of raising interoperability. ↓ Querying, syntactic, semantic, technical, legal. When I have 2 datasets, how easy is it to turn them into 1 dataset?

    → Open Definition & open licenses
    → The Internet: exchanging data world-wide
    → JSON, XML, CSV, HTML… Open Standards
    → Linked Data: work in progress
  13. Open Data is only a legal definition… but: the 5 stars of Linked Open Data, 5stardata.info
  14. Sharing data between 2 systems: your system and a third-party system. Agree on a protocol. The protocol will determine which questions can be answered in a timely fashion. The third party can ask questions to your system as previously agreed.
  15. Sharing data on the Web. Your system; ? ? ? ? ? ? Maximizing reuse → need to raise the interoperability.
  16. How to allow for asking any kind of query? Data dump; ask any question. Your system; 3rd party; your system; ? ? ? ? ? ?
  17. Asking questions. Data dump; ask any question. Your system; 3rd party; your system; ? ? ? ? ? ? Data publishing: scalable, every request is cacheable, dataset split in fragments.
  18. A long tail for e.g., transport data services... Hard to guess which kind of queries will be needed… Axes: more specific features, size of audience. Examples: Google Maps, Proximus, CityMapper, Go-OV, Ally, Transit App, NextTrain smartwatch.
  19. Asking questions: route planning algorithms as a service. Data dump; your system; 3rd party; your system; ? ? ? ? ? ? Does not scale: extra users come with extra load. Does not give the necessary flexibility to companies.
  20. Discover all the necessary data on the Web. Just like websites, we want your data to be highly available.
  21. What we ask data owners. API fanboys; real data reusers. Need Open Data; want services on top of data.
  22. What we ask data owners. Data dumps; smart servers. Data publishing (cheap/reliable); data services (rather expensive/unreliable). Entire query languages over HTTP. Dataset split in fragments. Smart agents. Algorithms as a service. Read more at http://linkeddatafragments.org. API fanboys; Open Data.
  23. Business model? API fanboys; real data reusers. Need Open Data; need services on top of data. Business opportunity?
  24. Servers publishing Open Data, e.g.,

    • all the planned and actual arrivals and departures
    • the network of roads in a certain region

    World-wide web services, e.g.,

    • a route planner: from → to
    • the closest station to your current location?

    Scalable businesses: $$$ $ $ $$$. End-users.
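
Slides 9 to 11 show the same imec record as a table, as JSON, as XML and as triples, and ask how easily two datasets can be turned into one. The Python sketch below is one way to illustrate that idea, not the method used in the talk. The helper names `record_to_triples` and `describe` are hypothetical, and which machine publishes which statement is an assumption, since the slide does not say.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def record_to_triples(subject: str, record: Dict[str, str]) -> Set[Triple]:
    """Turn one table row (column -> value) into self-describing triples."""
    return {(subject, predicate, obj) for predicate, obj in record.items()}


# Assumption: each machine on the Web publishes one of the statements.
machine_1 = record_to_triples("imec", {"same as": "iMinds"})
machine_2 = record_to_triples("imec", {"type": "company"})
machine_3 = record_to_triples("iMinds", {"location": "X"})

# With triples, turning several datasets into one is a plain set union.
merged = machine_1 | machine_2 | machine_3


def describe(subject: str, triples: Set[Triple]) -> Dict[str, str]:
    """Collect what is known about a subject, following 'same as' links."""
    aliases = {subject}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            # Grow the alias set when exactly one side is already known.
            if p == "same as" and (s in aliases) != (o in aliases):
                aliases |= {s, o}
                changed = True
    return {p: o for s, p, o in triples if s in aliases and p != "same as"}


print(describe("imec", merged))
```

Asking about `imec` also returns the location that was only stated about `iMinds`, because the "same as" statement links the two names. In real Linked Data the names would be URIs rather than short strings.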
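
Slides 17 and 22 describe data publishing as scalable because the dataset is split in fragments and every request is cacheable, with the smart part moved to the client. A minimal Python sketch of that split follows, assuming a time-sorted list of departures. The URL pattern `/departures?page=N`, the page size, the function names and the sample stations are all made up for illustration and do not come from the talk or from any real API.

```python
from typing import Dict, List, Optional

PAGE_SIZE = 2  # hypothetical fragment size


def publish_fragments(departures: List[dict],
                      page_size: int = PAGE_SIZE) -> Dict[str, dict]:
    """Split a time-sorted dataset into static fragments linked by 'next'."""
    ordered = sorted(departures, key=lambda d: d["time"])
    pages = [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]
    fragments = {}
    for n, page in enumerate(pages):
        next_url = f"/departures?page={n + 1}" if n + 1 < len(pages) else None
        fragments[f"/departures?page={n}"] = {"items": page, "next": next_url}
    return fragments


def first_departure_after(fragments: Dict[str, dict], station: str, time: str,
                          start: str = "/departures?page=0") -> Optional[dict]:
    """Client-side query: follow 'next' links until a departure matches."""
    url: Optional[str] = start
    while url is not None:
        fragment = fragments.get(url)  # stands in for a cacheable HTTP GET
        if fragment is None:
            break
        for departure in fragment["items"]:
            if departure["station"] == station and departure["time"] >= time:
                return departure
        url = fragment["next"]
    return None


# Made-up sample data; "HH:MM" strings sort correctly as text.
departures = [
    {"station": "Gent", "time": "08:05"},
    {"station": "Brussel", "time": "08:10"},
    {"station": "Gent", "time": "08:35"},
    {"station": "Brussel", "time": "08:40"},
    {"station": "Gent", "time": "09:05"},
]
fragments = publish_fragments(departures)
print(first_departure_after(fragments, "Gent", "08:30"))
```

The server only ever hands out the same few documents, so any HTTP cache can serve them, and extra users do not add query load. The cost is that the client does more work and may fetch several fragments to answer one question.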