Introduction slides to Linked Open Data

My latest slides on Linked Open Data publishing and reuse, covering data, open data, linked data and linked open data.

Feel free to reuse my slides (I'd be honoured if you do), under CC BY-SA.

Pieter Colpaert

March 10, 2015

Transcript

  1. Programme: (1) The basics: Data → Open Data → Linked Data. (2) Linked Open Data: how to publish data?
  2. Data. Wikipedia says:

    English (disambiguation): data is uninterpreted information.
    English (computing): data is any sequence of symbols given meaning by specific acts of interpretation.
    Dutch: data is the plural of datum, which is an observation of a fact.
  3. Diagram labels: ↓ querying, syntactic, object, semantic, technical, legal, process. Questions to ask when merging two datasets:

    Could the data governance of the two datasets be merged?
    Are you legally allowed to merge 2 datasets?
    Can you connect the communication channels? E.g., merging a dataset published on a CD with a dataset published on a floppy disk.
    How easy is it to ask certain questions across the borders of the datasets?
    How interoperable are the serialisation formats? E.g., JSON vs. PDF.
    What can you request from the server?
    Do the words in one dataset mean the same as the words in the other?
  4. Data on the web (diagram): reuse is allowed; reuse in a gray zone; unauthorised reuse.
  5. How can we find open data? It is made available through open data portals: http://data.gov.uk, http://datahub.io, http://open-data.europa.eu, http://data.gent.be, …

    It can also be found via links in existing datasets, e.g., http://dbpedia.org/resource/Ghent
  6. Linked Data: because it is impossible to store all the world’s knowledge on one machine.
  7. Serialisations: the same record as a table, as JSON and as XML.

    Table / CSV / Spreadsheet:
    name   | type    | same as | location
    iMinds | company | IBBT    | Gaston Crommenlaan 8

    JSON: { "iMinds" : { "type" : "company", "same as" : "IBBT", "location" : "Gaston Crommenlaan 8" } }

    XML: <iMinds> <type>company</type> <sameas>IBBT</sameas> <location>Gaston Crommenlaan 8</location> </iMinds>
  8. Triple structure: the same record as triples, next to the table, JSON and XML forms.

    Table / CSV / Spreadsheet:
    name   | type    | same as | location
    iMinds | company | IBBT    | Gaston Crommenlaan 8

    Triples:
    <iMinds> <type> <company> .
    <iMinds> <sameas> <IBBT> .
    <iMinds> <location> "Gaston Crommenlaan 8" .

    JSON: { "iMinds" : { "type" : "company", "same as" : "IBBT", "location" : "Gaston Crommenlaan 8" } }

    XML: <iMinds> <type>company</type> <sameas>IBBT</sameas> <location>Gaston Crommenlaan 8</location> </iMinds>
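
The step from a table row to triples on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `rows_to_triples` and the choice of the `name` column as the subject are assumptions, not something the slides prescribe.

```python
import csv
import io

# The one-row table from the slide, written out as CSV text.
CSV_TEXT = "name,type,same as,location\niMinds,company,IBBT,Gaston Crommenlaan 8\n"


def rows_to_triples(csv_text, subject_column="name"):
    """Turn every cell of every row into a (subject, predicate, object) triple.

    Assumption: the value in `subject_column` identifies the row, and each
    other column header acts as the predicate.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row[subject_column]
        for column, value in row.items():
            if column != subject_column:
                triples.append((subject, column, value))
    return triples


triples = rows_to_triples(CSV_TEXT)
for subject, predicate, obj in triples:
    print(subject, "|", predicate, "|", obj)
```

Each cell becomes its own fact, which is why triples can later be spread over several machines without losing meaning.
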
  9. Linked data on the World Wide Web (diagram): the facts “iMinds same as IBBT”, “iMinds is a company” and “IBBT located at Gaston Crommenlaan 8” are shown alongside Machine 1, Machine 2 and Machine 3.
  10. Problem: semantic interoperability. The word “company” is ambiguous. How can we make sure that machines understand each other?

    What about “is a”? And what about “iMinds”?
  11. Solution: Uniform Resource Identifiers (URIs).

    iMinds → http://data.kbodata.be/organisation/0866_386_380#id
    is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    Company → http://www.w3.org/ns/regorg#RegisteredOrganization
    A triple is an atomic piece of data (a datum or a fact) that cannot be misunderstood at machine level in a Web context.
  12. Diagram: iMinds → is a → company, where

    iMinds → http://data.kbodata.be/organisation/0866_386_380#id
    is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    Company → http://www.w3.org/ns/regorg#RegisteredOrganization
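
As a rough sketch of what this mapping buys you, the snippet below replaces the three ambiguous words with the URIs from the slide and writes the fact as one line in N-Triples style. The names `VOCABULARY` and `to_ntriples_line` are made up for this example.

```python
# Word-to-URI mapping taken from the slide.
VOCABULARY = {
    "iMinds": "http://data.kbodata.be/organisation/0866_386_380#id",
    "is a": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "company": "http://www.w3.org/ns/regorg#RegisteredOrganization",
}


def to_ntriples_line(subject, predicate, obj, vocabulary=VOCABULARY):
    """Look up each word's URI and format the triple as <s> <p> <o> ."""
    terms = (vocabulary[subject], vocabulary[predicate], vocabulary[obj])
    return " ".join(f"<{uri}>" for uri in terms) + " ."


line = to_ntriples_line("iMinds", "is a", "company")
print(line)
```

The printed line no longer depends on what any one machine thinks "company" means, because every term is a globally unique identifier.
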
  13. Summary. New terms: data quality, data interoperability, triples, open data, linked open data cloud.

    Linked Open Data means: making your data more interoperable with other datasets on the web by using URIs as identifiers and triples as atomic building blocks.
  14. Linked Data principles:

    1. Use URIs as names for things.
    2. Use HTTP URIs so that people can look up those names.
    3. When looking up URIs, provide useful information.
    4. Include links to other URIs for discoverability.
    Only important if you’re defining new URIs; not important if you’re publishing facts by reusing identifiers.
  15. E.g., I’m launching a new company:

    {mynewcompany} → http://{mynewcompany}.be/#org
    is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    Company → http://www.w3.org/ns/regorg#RegisteredOrganization
    An identifier for your company. The semantics are controlled by you.
  16. E.g., I’m launching a new company:

    {mynewcompany} → http://{mynewcompany}.be/#org
    is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    Company → http://www.w3.org/ns/regorg#RegisteredOrganization

    {mynewcompany} → http://{mynewcompany}.be/#org
    has a homepage → http://xmlns.com/foaf/0.1/homepage
    http://{mynewcompany}.be/
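
The two facts about the new company could be generated as follows. This is a minimal sketch: `company_triples` and `is_http_uri` are hypothetical helpers, and the name `mynewcompany` simply fills the slide's `{mynewcompany}` placeholder. The check at the end only covers the second Linked Data principle (every name is an HTTP URI); it does not look anything up.

```python
from urllib.parse import urlparse

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
REGISTERED_ORG = "http://www.w3.org/ns/regorg#RegisteredOrganization"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"


def company_triples(name):
    """Build the two triples from the slide for a company called `name`."""
    org = f"http://{name}.be/#org"      # new identifier, controlled by you
    homepage = f"http://{name}.be/"
    return [
        (org, RDF_TYPE, REGISTERED_ORG),  # reuses existing identifiers
        (org, FOAF_HOMEPAGE, homepage),
    ]


def is_http_uri(term):
    """True if `term` parses as an http(s) URI with a host part."""
    parsed = urlparse(term)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


triples = company_triples("mynewcompany")  # placeholder name from the slide
all_http = all(is_http_uri(term) for triple in triples for term in triple)
print(triples)
print("all terms are HTTP URIs:", all_http)
```

Only the `#org` identifier is new here; the predicate and class URIs are reused from existing vocabularies.
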
  17. Publishing methods (a couple of examples):

    1. Data dumps
    2. Triples within HTML pages
    3. JSON → JSON-LD web services
    4. Triple pattern fragments
  18. Data dumps: 1 big file with a list of triples, e.g., http://wiki.dbpedia.org/Downloads2014 → all facts in 1 file.

    Pro: can be imported into your own system without trouble.
    Con: hard to keep up to date.
  19. JSON API. The old way: GET a URL, and the data is immediately available to your app.

    Pro: easy to refresh → real-time data.
    http://{address to API document on Empire State}
  20. JSON-LD API. Add a “context”: each word is mapped to a URI, so each fact becomes a triple.

    Pro: API responses are disambiguated.
    Pro: 2 APIs with a similar context are semantically interoperable.
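
To make the role of the context concrete, here is a deliberately simplified sketch. It is not the JSON-LD processing algorithm: it only handles a flat document with an `@id` and a context that maps keys straight to URIs. The two API responses, the `example.org` URLs and the `expand` helper are all invented for illustration.

```python
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"


def expand(doc):
    """Map each plain key to its URI via @context and return triples.

    Simplification: keys missing from the context are skipped, and values
    are passed through untouched.
    """
    context = doc["@context"]
    subject = doc["@id"]
    return [
        (subject, context[key], value)
        for key, value in doc.items()
        if not key.startswith("@") and key in context
    ]


# Two hypothetical APIs that use different words for the same thing.
api_a = {
    "@context": {"homepage": FOAF_HOMEPAGE},
    "@id": "http://example.org/#org",
    "homepage": "http://example.org/",
}
api_b = {
    "@context": {"website": FOAF_HOMEPAGE},
    "@id": "http://example.org/#org",
    "website": "http://example.org/",
}

print(expand(api_a))
print(expand(api_a) == expand(api_b))
```

Although the two responses use different keys, both expand to the same triple, which is the sense in which shared URIs make APIs semantically interoperable.
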
  21. Triple Pattern Fragments servers allow basic questions of the form ?subject → ?predicate → ?object, e.g., iMinds → is a → company.

    Pro: allows apps to ask complex queries over the Web of data, e.g., http://fragments.dbpedia.org
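
The kind of basic question such a server answers can be mimicked locally. The sketch below matches a triple pattern against an in-memory list, with `None` standing in for a `?variable`. It says nothing about the actual HTTP interface; `match_pattern` and the sample facts (taken from the earlier slides) are assumptions made for the example.

```python
# A tiny in-memory dataset built from the facts used earlier in the deck.
FACTS = [
    ("iMinds", "is a", "company"),
    ("iMinds", "same as", "IBBT"),
    ("IBBT", "located at", "Gaston Crommenlaan 8"),
]


def match_pattern(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# ?subject → is a → ?object
print(match_pattern(FACTS, predicate="is a"))
# iMinds → ?predicate → ?object
print(match_pattern(FACTS, subject="iMinds"))
```

A client can answer a more complex query by combining the results of several such simple patterns, which keeps the server side cheap.
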