Slide 1

Slide 1 text

Open Data in 60 minutes https://pietercolpaert.be/#me Ghent University – Guest lecture 2017-05-08

Slide 2

Slide 2 text

Will engineering a better information system help us build a type I civilization? https://en.wikipedia.org/wiki/Kardashev_scale

Slide 3

Slide 3 text

Open Data in the world For example Data Portal from Worldbank http://data.worldbank.org

Slide 4

Slide 4 text

Open Data in Europe Public Sector Information INSPIRE directive PSI directive http://europeandataportal.eu

Slide 5

Slide 5 text

http://www.alterechos.be/bianca-debaets-lopen-data-a-bruxelles-un-potentiel-de-1-500-emplois/

Slide 6

Slide 6 text

Costs Benefits When are you going to reuse open data?

Slide 7

Slide 7 text

Public Sector Information vs. Open Data? PSI Open Data

Slide 8

Slide 8 text

Status of e.g., Public Transit in BE?

            SNCB    STIB    De Lijn  TEC
Schedules   shared  shared  shared   open
Real-time   shared  shared  shared   planned
Tickets     no      no      yes      no
Historic    no      no      no       open

Slide 9

Slide 9 text

Open Data vs. data sharing?

Slide 10

Slide 10 text

Sharing data between 2 systems Your system Third party system Agree on a protocol Will determine which questions can be answered in a timely fashion Can ask questions to your system as previously agreed

Slide 11

Slide 11 text

Sharing data on the Web Your system ? ? ? ? ? ? Maximizing reuse → need to raise the interoperability

Slide 12

Slide 12 text

↓ Querying syntactic semantic technical legal When I have got 2 datasets, how easy is it to use them as if they were 1?

Slide 13

Slide 13 text

As a reuser, you need certainty that you won’t get sued https://github.com/iRail/stations

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

OpenDefinition.org ↓ Querying syntactic semantic technical legal

Slide 16

Slide 16 text

Data licenses Interested in the full story? https://pietercolpaert.be/open%20data/2017/02/23/cc0.html

Slide 17

Slide 17 text

A story of raising interoperability
When I have 2 datasets, how easy is it to turn them into 1 dataset?
↓ Querying syntactic semantic technical legal
→ Open Definition & open licenses
→ The Internet: exchanging data world-wide
→ JSON, XML, CSV, HTML… Open Standards

Slide 18

Slide 18 text

Serialisations

Table / CSV / Spreadsheet:
name       type     city  population
StP-Plein  Parking  Gent  257k

JSON:
{ "StP-Plein" : { "type" : "Parking", "city" : "Gent", "population" : "257k" } }

XML:
Parking Gent 257k

↓ Querying syntactic semantic technical legal
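A minimal sketch of the point on this slide: one record, three serialisations. The field names come from the slide's table header. The XML element names (`place`, `type`, `city`, `population`) are assumptions made for illustration, since the slide's XML markup is not preserved here.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The record from the slide; keys follow the slide's table header.
record = {"name": "StP-Plein", "type": "Parking", "city": "Gent", "population": "257k"}


def to_csv(rec):
    """Serialise the record as a header row plus one data row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_json(rec):
    """Mirror the slide's JSON: the name is the key, the other fields are nested."""
    body = {key: value for key, value in rec.items() if key != "name"}
    return json.dumps({rec["name"]: body})


def to_xml(rec):
    """Serialise as XML; the element names are hypothetical, not from the slide."""
    root = ET.Element("place", attrib={"name": rec["name"]})
    for key in ("type", "city", "population"):
        ET.SubElement(root, key).text = rec[key]
    return ET.tostring(root, encoding="unicode")
```

All three strings carry the same information; only the syntax a reuser has to parse differs.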

Slide 19

Slide 19 text

Table / CSV / Spreadsheet:
name       type     city  population
StP-Plein  Parking  Gent  257k

JSON:
{ "StP-Plein" : { "type" : "Parking", "city" : "Gent", "population" : "257k" } }

XML:
Parking Gent 257k

Triples (3 times a datum):
StP-Plein type Parking .
StP-Plein city Gent .
Gent population "257k" .
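As a rough illustration of the triple view, the sketch below stores the record as three (subject, predicate, object) statements and answers a question by following a link from one statement to the next. It assumes the three statements are the ones the deck shows for this example; the helper names `match` and `population_of_city_of` are hypothetical.

```python
# Three statements, each as (subject, predicate, object).
triples = [
    ("StP-Plein", "type", "Parking"),
    ("StP-Plein", "city", "Gent"),
    ("Gent", "population", "257k"),
]


def match(data, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in data
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def population_of_city_of(data, place):
    """Follow place -> city -> population across two triples."""
    for _, _, city in match(data, s=place, p="city"):
        for _, _, population in match(data, s=city, p="population"):
            return population
    return None
```

Unlike the table, the triples make explicit that the population belongs to the city, not to the parking.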

Slide 20

Slide 20 text

Thought experiment: decentralizing publishing
World Wide Web
HTTP Machine 1: St-P Plein city Gent
HTTP Machine 2: St Pietersplein type Parking
HTTP Machine 3: Gent population 257k
A user agent visiting each machine knows more than any of the machines independently
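The thought experiment can be sketched in a few lines, with plain Python sets standing in for the three HTTP machines. Which statement lives on which machine is an assumption based on the order of the labels on the slide, and `visit` is a hypothetical helper.

```python
# Each "machine" publishes only the statements it knows about.
machine_1 = {("St-P Plein", "city", "Gent")}
machine_2 = {("St Pietersplein", "type", "Parking")}
machine_3 = {("Gent", "population", "257k")}


def visit(*machines):
    """Simulate a user agent that fetches every machine and keeps all statements."""
    known = set()
    for statements in machines:
        known |= statements
    return known


agent_knowledge = visit(machine_1, machine_2, machine_3)

# The distinct subjects the agent has seen.
subjects = {s for s, _, _ in agent_knowledge}
```

The agent ends up with three statements where each machine has one. Note that "St-P Plein" and "St Pietersplein" stay two separate subjects, because nothing tells the agent they name the same square.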

Slide 21

Slide 21 text

Problem Sint-Pietersplein is a Parking Site ? ↓ Querying syntactic semantic technical legal

Slide 22

Slide 22 text

Problem Sint-Pietersplein is a Parking Site ? ↓ Querying syntactic semantic technical legal

Slide 23

Slide 23 text

Solution: Uniform Resource Identifiers (URIs)
Sint Pietersplein → https://stad.gent/id/parking/P10
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type
Parking → http://vocab.datex.org/terms#UrbanParkingSite
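A small sketch of why URIs help: when two sources use the same identifier for the parking, their statements merge without any guessing about spelling. The three URIs for the parking, `rdf:type` and the parking class are the ones on the slide. The `example.org` predicate and city URI are hypothetical and only there to give the second source something to say.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PARKING_P10 = "https://stad.gent/id/parking/P10"
URBAN_PARKING_SITE = "http://vocab.datex.org/terms#UrbanParkingSite"

# Hypothetical terms, not from the slides (example.org is reserved for examples).
EX_CITY = "http://example.org/terms#city"
EX_GENT = "http://example.org/id/city/gent"

# Two independent sources that both use the same URI for the parking.
source_a = {(PARKING_P10, RDF_TYPE, URBAN_PARKING_SITE)}
source_b = {(PARKING_P10, EX_CITY, EX_GENT)}


def describe(subject, *sources):
    """Collect every (predicate, object) pair any source states about subject."""
    merged = set().union(*sources)
    return {(p, o) for s, p, o in merged if s == subject}


def ntriple(s, p, o):
    """Serialise one all-URI triple as an N-Triples line."""
    return f"<{s}> <{p}> <{o}> ."
```

For example, `ntriple(PARKING_P10, RDF_TYPE, URBAN_PARKING_SITE)` yields the slide's statement as a single machine-readable line.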

Slide 24

Slide 24 text

A story of raising interoperability
When I have 2 datasets, how easy is it to turn them into 1 dataset?
↓ Querying syntactic semantic technical legal
→ Open Definition & open licenses
→ The Internet: exchanging data world-wide
→ JSON, XML, CSV, HTML… Open Standards
→ Linked Data

Slide 25

Slide 25 text

Open Data is only a legal definition… but: The 5 stars of Linked Open Data http://5stardata.info

Slide 26

Slide 26 text

Sharing data between 2 systems Your system Third party system Agree on a protocol Will determine which questions can be answered in a timely fashion Can ask questions to your system as previously agreed

Slide 27

Slide 27 text

Sharing data on the Web Your system ? ? ? ? ? ? Maximizing reuse → need to raise the interoperability

Slide 28

Slide 28 text

data dump Ask any question How to allow for asking any kind of query? Your system 3rd party Your system ? ? ? ? ? ?

Slide 29

Slide 29 text

data dump Ask any question Asking questions Your system 3rd party Your system ? ? ? ? ? ? Data publishing: Cacheable Dataset split in fragments
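One way to picture "dataset split in fragments" is a set of fixed-size pages, each with its own URL and a link to the next page. The sketch below is an assumption about what such a split could look like, not the deck's actual design: the page size, the `?page=` URL scheme and the `example.org` host are all hypothetical, and a dict stands in for the HTTP server.

```python
def fragment(records, page_size, base_url):
    """Split records into pages keyed by URL; each page links to the next one."""
    n_pages = max(1, (len(records) + page_size - 1) // page_size)
    pages = {}
    for i in range(n_pages):
        url = f"{base_url}?page={i + 1}"
        next_url = f"{base_url}?page={i + 2}" if i + 1 < n_pages else None
        pages[url] = {
            "items": records[i * page_size:(i + 1) * page_size],
            "next": next_url,
        }
    return pages


def read_all(pages, start_url):
    """Client side: follow the 'next' links until the dataset is exhausted."""
    items = []
    url = start_url
    while url is not None:
        page = pages[url]  # stands in for an HTTP GET of that URL
        items.extend(page["items"])
        url = page["next"]
    return items


departures = ["08:01", "08:16", "08:31", "08:46", "09:01", "09:16", "09:31"]
pages = fragment(departures, 3, "https://example.org/departures")
```

Because every client asks for the same handful of page URLs, the responses can be cached, and the question answering happens on the client.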

Slide 30

Slide 30 text

A long tail for e.g., transport data services ... Hard to guess which kind of queries will be needed … More specific features Size of audience Google maps Proximus CityMapper Go-OV Ally Transit App NextTrain smartwatch

Slide 31

Slide 31 text

Proposal
http://api.{mycompany}/?from={A}&to={B}
  &departuretime=2016-10-16T14:45.024Z
  &wheelchairaccessible=true
  &transit_modes=plane,railway,bus,car
  &algoritm_mode=shortest
  ...
Yet this interface will need to answer all questions for all third party apps…
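To make the proposal concrete, here is a sketch of how a third-party client would build such a request with Python's standard library. The host `api.example.org` is a placeholder for the slide's `api.{mycompany}`, the parameter names are copied from the slide as spelled there, and `route_query_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def route_query_url(base, origin, destination, **options):
    """Build a route-planning request; every extra need becomes another parameter."""
    params = {"from": origin, "to": destination, **options}
    return f"{base}/?{urlencode(params)}"


url = route_query_url(
    "http://api.example.org",  # placeholder host, not a real service
    "A",
    "B",
    wheelchairaccessible="true",
    transit_modes="plane,railway,bus,car",
    algoritm_mode="shortest",  # spelled as on the slide
)

# Decode the query string again to see what the server would receive.
received = parse_qs(urlsplit(url).query)
```

Every new reuser requirement means one more keyword argument here and one more feature the server has to implement, which is the scaling concern the slide raises.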

Slide 32

Slide 32 text

data dump Route planning algorithms as a service Asking questions Your system 3rd party Your system ? ? ? ? ? ? Does not scale: Extra users come with extra load Does not give necessary flexibility to companies

Slide 33

Slide 33 text

Discover all the necessary data on the Web Just like websites, we want your data to be highly available

Slide 34

Slide 34 text

API fanboys Real data reusers Need Open Data Want services on top of data What we ask data owners

Slide 35

Slide 35 text

What we ask data owners Data dumps Smart servers Data publishing (cheap/reliable) Data services (rather expensive/unreliable) Entire query languages over HTTP Dataset split in fragments Smart agents algorithms as a service Read more at http://linkeddatafragments.org API fanboys Open Data

Slide 36

Slide 36 text

Business model? API fanboys Real data reusers Need Open Data Need services on top of data Business opportunity?

Slide 37

Slide 37 text

Servers publishing Open Data, e.g.,
● all the planned and actual arrivals and departures
● the network of roads in a certain region
worldwide web-services, e.g.,
● a route planner: from → to
● the closest station to your current location?
Scalable businesses $$$ $ $ $$$ end-users

Slide 38

Slide 38 text

We want a world where knowledge creates power for the many, not the few. Looking for a student job in Open Data? Check out http://summerofcode.be