Linked Open Data - A Step Towards A Semantic Web

Presented at Web Camp III

Bebo White

July 16, 2011

Transcript

  1. Bebo White, SLAC National Accelerator Laboratory
     Linked Open Data: A Step Towards a Semantic Web
     Web Camp III, Stanford, July 2011
  2. “Developments in science and information processing have changed the meaning of the verb, ‘to know.’ It used to mean ‘having information stored in one’s memory.’ It now means the process of having access to information and knowing how to use it.” (Herbert Simon)
  3. Status of the Semantic Web
     - For years we have heard about it! Is it real?
     - Where are the applications for it? Is there a “killer app?”
     - Who is using it?
     - What role (if any) will it play in the Future Web?
     The Semantic Web is alive and well in Linked Open Data (LOD)!
  4. Perceptions of Web Content
     - The Web is generally thought of as being composed of pages and documents
     - We have been able to insert some data
       - Images: <img src="…">
       - Multimedia
     - Web 2.0 mashups provided a new way of thinking about a “Web of Data”, but the data was awkward to obtain
       - APIs
       - “Screen-scraping”
  5. The Web of Documents
     - Analogy: a global filesystem
     - Designed for: human consumption
     - Primary objects: documents (or sub-parts of documents)
     - Links between: documents (or sub-parts of documents)
     - Degree of structure in objects: fairly low
     - Semantics of content and links: implicit
  6. The Web of Documents: Issues
     - Simplicity
       - Loosely structured data, untyped links, disconnected data
     - Integration
       - Show me all the publications from HKU PhD students in Computer Science
     - Querying
       - Which papers have I written with colleagues outside the US?
  7. The Web of Linked Data
     - Analogy: a global database
     - Designed for: machines first, humans later
     - Primary objects: things (or descriptions of things)
     - Links between: things
     - Degree of structure in (descriptions of) things: high
     - Semantics of content and links: explicit
  8. Linked Data is
     - A way of publishing data on the Web that
       - Encourages reuse
       - Reduces redundancy
       - Maximizes its (real and potential) inter-connectedness
       - Enables network effects to add value to data
  9. Linked Data Technology Stack
     - URIs: Uniform Resource Identifiers
     - HTTP: Hypertext Transfer Protocol
     - RDF: Resource Description Framework
     - (RDFS/OWL): RDF Schema/Web Ontology Language
  10. URIs: Not Just for Web Pages
      - “A Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource” (RFC 3986)
      - Many different schemes: http://, ftp://, tel:, urn:, mailto:
      - Some URIs for “real world” things
        - http://www.bebowhite.com/
        - http://dbpedia.org/page/University_of_Hong_Kong
        - http://sws.geonames.org/1819729/
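To make the "many different schemes" point concrete, here is a minimal sketch using Python's standard `urllib.parse` module to read the scheme of a few URIs. The two http:// URIs come from the slide; the `mailto:` and `tel:` values are made-up placeholders, not real contact details.

```python
from urllib.parse import urlsplit

uris = [
    "http://www.bebowhite.com/",          # from the slide
    "http://sws.geonames.org/1819729/",   # from the slide
    "mailto:someone@example.org",         # hypothetical address
    "tel:+1-555-0100",                    # hypothetical number
]

def scheme_of(uri: str) -> str:
    """Return the scheme of a URI, i.e. the part before the first colon."""
    return urlsplit(uri).scheme

schemes = [scheme_of(u) for u in uris]
print(schemes)
```

The scheme only says how a name is structured; it does not by itself say whether the URI names a document or a "real world" thing.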
  11. HTTP
      - Data access mechanism
      - Using http:// URIs to identify things allows people to reference these things
  12. RDF
      - A data format for describing things and their interrelationships
      - Standardized (XML)
      - Easily parsed by machines
  13. FOAF: Friend of a Friend
      - An RDF vocabulary for describing people:
        - Identities
        - Interests
        - Affiliations
        - Social networks
        - Etc.
  14. Imagine…
      - A “Web” where
        - Documents are available for download on the Internet
        - But there would be no hyperlinks among them
  15. Data on the Web is Not Enough
      - Need a proper infrastructure for a real Web of Data
        - Data is available on the Web
          - Accessible via standard Web technologies
        - Data is interlinked over the Web
          - i.e., data can be integrated over the Web
      - This is where Semantic Web technologies come in
  16. Simplified Bookstore Data

      ID              | Author | Title                                                     | Publisher | Year
      ISBN 3642203914 | id_xyz | Social Media Tools and Platforms in Learning Environments | id_qpr    | 2011

      ID     | Name        | Homepage
      id_xyz | White, Bebo | http://www.bebowhite.com

      ID     | Publisher’s name | City
      id_qpr | Springer         | New York
  17. Exported Data as a Set of Relations
      [Figure: a graph connecting the book node http://…isbn/3642203914 to the values “Social Media Tools…”, “2011”, “White, Bebo”, http://www.bebowhite.com, “Springer” and “New York”. Edge labels: a:title, a:year, a:author, a:name, a:homepage, a:p_name, a:city]
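As a rough illustration of how the bookstore tables could be exported as labelled relations, the sketch below turns every non-ID column of every row into one (subject, label, value) tuple. The `a:` labels follow the slide, except `a:publisher`, which is an assumed label for the book-to-publisher link because the transcript does not show one. The table IDs (`id_xyz`, `id_qpr`) are reused as node names, and the elided book URI is kept exactly as it appears in the figure.

```python
# Rows copied from the simplified bookstore tables.
# The book's ID column is replaced by the (elided) URI shown in the figure.
books = [{"id": "http://…isbn/3642203914", "author": "id_xyz",
          "title": "Social Media Tools and Platforms in Learning Environments",
          "publisher": "id_qpr",  # exported as a:publisher (assumed label)
          "year": "2011"}]
authors = [{"id": "id_xyz", "name": "White, Bebo",
            "homepage": "http://www.bebowhite.com"}]
publishers = [{"id": "id_qpr", "p_name": "Springer", "city": "New York"}]

def export_relations(rows):
    """Turn each non-id column of each row into one (subject, label, value) relation."""
    return [(row["id"], "a:" + column, value)
            for row in rows
            for column, value in row.items()
            if column != "id"]

relations = (export_relations(books)
             + export_relations(authors)
             + export_relations(publishers))

for relation in relations:
    print(relation)
```

Nothing is physically converted here: the relations are computed from the rows on demand, which is the same idea as generating them at query time.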
  18. Notes on Exporting the Data (1)
      - Relations form a graph
        - The nodes refer to the “real” data or contain some literal
        - How the graph is represented inside the machine is, to a first approximation, immaterial
  19. Notes on Exporting the Data (2)
      - Data export does not necessarily mean physical conversion of the data
        - Relations can be generated on-the-fly at query time
          - Via SQL “bridges”
          - Scraping HTML pages
          - Extracting data from Excel sheets
          - etc.
      - One can export part of the data
  20. RDF Triples (1)
      - Formalize the data about the book
      - We “connected” the data…
      - But a simple connection is not enough… data should be named somehow
      - Hence the RDF Triples: a labelled connection between two resources
  21. RDF Triples (2)
      - An RDF Triple (s,p,o) is such that:
        - “s” and “p” are URIs, i.e., resources on the Web; “o” is a URI or a literal
        - “s”, “p”, and “o” stand for “subject”, “property”, and “object”
        - Here is the complete triple: (<http://…isbn…6682>, <http://…/original>, <http://…isbn…409X>)
      - RDF is a general model for such triples
      - With machine-readable formats like RDF/XML, Turtle, N3, RDFa, …
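The (s,p,o) rule can be sketched in a few lines of plain Python. This is only an illustration of the constraint stated on the slide (subject and property are URIs, the object is a URI or a literal), not a real RDF library; the `URI` marker class, the `make_triple` helper and the example.org names are all hypothetical.

```python
from typing import NamedTuple, Union

class URI(str):
    """Marks a string as a URI, as opposed to a plain literal."""

class Triple(NamedTuple):
    subject: URI
    predicate: URI        # called "property" on the slide
    obj: Union[URI, str]  # a URI or a literal value

def make_triple(s, p, o) -> Triple:
    """Build a triple, insisting that subject and property are URIs."""
    if not isinstance(s, URI) or not isinstance(p, URI):
        raise TypeError("subject and property must be URIs")
    return Triple(s, p, o)

# Hypothetical identifiers, for illustration only.
example = make_triple(URI("http://example.org/book/1"),
                      URI("http://example.org/terms/title"),
                      "An example title")
print(example)
```

Passing a plain string as the subject raises `TypeError`, which mirrors the rule that a literal may appear only in the object position.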
  22. RDF Triples (3)
      - Resources can use any URI
        - http://www.example.org/file.html#home
        - http://www.example.org/file2.xml#xpath(//q[@a=b])
        - http://www.example.org/form?a=b&c=d
      - RDF triples form a directed, labeled graph (the best way to think about them!)
  23. A Simple RDF Example (in RDF/XML)
      <rdf:Description rdf:about="http://…/isbn/2020386682">
        <f:titre xml:lang="fr">Outils de Medias Sociaux</f:titre>
        <f:original rdf:resource="http://…/isbn/3642203914"/>
      </rdf:Description>
      (Note: namespaces are used to simplify the URIs)
      [Figure: the node http://…isbn/2020386682 linked to the literal “Outils de Medias Sociaux…” and to the node http://…isbn/3642203914]
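To show how such an RDF/XML fragment maps onto triples, here is a minimal sketch that reads it with Python's standard `xml.etree.ElementTree`. It handles only the simple `rdf:Description` pattern from the slide and is not a general RDF/XML parser. The `rdf:RDF` wrapper is added so the fragment is a complete XML document, the namespace bound to `f:` is a hypothetical placeholder (the slide does not give one), and the elided URIs are kept exactly as shown.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
F_NS = "http://example.org/f#"  # hypothetical namespace for the f: prefix

document = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:f="{F_NS}">
  <rdf:Description rdf:about="http://…/isbn/2020386682">
    <f:titre xml:lang="fr">Outils de Medias Sociaux</f:titre>
    <f:original rdf:resource="http://…/isbn/3642203914"/>
  </rdf:Description>
</rdf:RDF>
"""

def triples_from_rdfxml(text):
    """Extract (subject, property, object) tuples from rdf:Description elements."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; join them into one URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            # An rdf:resource attribute means the object is another resource;
            # otherwise the element text is a literal.
            obj = prop.get(f"{{{RDF_NS}}}resource") or prop.text
            triples.append((subject, predicate, obj))
    return triples

triples = triples_from_rdfxml(document)
for triple in triples:
    print(triple)
```

The two child elements become two triples sharing one subject, which is the small graph drawn on the slide.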
  24. RDF in Programming Practice
      - For example, using Java+Jena (HP’s Bristol Lab):
        - A “Model” object is created
        - The RDF file is parsed and results stored in the Model
        - The Model offers methods to retrieve:
          - triples
          - (property,object) pairs for a specific subject
          - (subject,property) pairs for specific object
          - etc.
        - The rest is conventional programming…
      - Similar tools exist in Python, PHP, etc.
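The workflow described above can be imitated with a toy in-memory store. The sketch below is plain Python and deliberately not the Jena API: the `Model` class, its method names and the example.org URIs are hypothetical, chosen only to mirror the retrieval operations listed on the slide. Parsing is skipped; triples are added directly.

```python
class Model:
    """A toy in-memory triple store (an illustration, not the Jena API)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, prop, obj):
        self._triples.add((subject, prop, obj))

    def triples(self):
        """All triples, sorted for stable output."""
        return sorted(self._triples)

    def properties_of(self, subject):
        """(property, object) pairs for a specific subject."""
        return sorted((p, o) for s, p, o in self._triples if s == subject)

    def subjects_of(self, obj):
        """(subject, property) pairs for a specific object."""
        return sorted((s, p) for s, p, o in self._triples if o == obj)

model = Model()
book = "http://example.org/book/1"      # hypothetical URI
author = "http://example.org/person/1"  # hypothetical URI
model.add(book, "a:title", "Social Media Tools…")
model.add(book, "a:author", author)
model.add(author, "a:name", "White, Bebo")

print(model.properties_of(book))
print(model.subjects_of(author))
```

Once the triples are in the model, everything else is ordinary list and tuple handling, which is the "conventional programming" the slide refers to.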
  25. The Rough Structure of Data Integration
      - Map the various data onto an abstract data representation
        - Make the data independent of its internal representation…
      - Merge the resulting representations
      - Start making queries on the whole!
        - Queries not possible on the individual data sets
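The map, merge and query steps can be sketched with two small sets of triples. All data below is illustrative: the example.org host is a placeholder, and linking a book straight to the author's homepage is a simplification. Merging is a set union, and the query follows one link from each data set, so neither set can answer it alone.

```python
# Data set A: the English book and its author (simplified, hypothetical URIs).
dataset_a = {
    ("http://example.org/isbn/3642203914", "a:title", "Social Media Tools…"),
    ("http://example.org/isbn/3642203914", "a:author", "http://www.bebowhite.com"),
}
# Data set F: the French translation and a link to its original.
dataset_f = {
    ("http://example.org/isbn/2020386682", "f:titre", "Outils de Medias Sociaux"),
    ("http://example.org/isbn/2020386682", "f:original",
     "http://example.org/isbn/3642203914"),
}

def objects(graph, subject, prop):
    """All objects reachable from subject via the given property."""
    return {o for s, p, o in graph if s == subject and p == prop}

def authors_of_original(graph, translation):
    """Follow f:original, then a:author; needs triples from both data sets."""
    return {author
            for original in objects(graph, translation, "f:original")
            for author in objects(graph, original, "a:author")}

merged = dataset_a | dataset_f  # merging is just the union of the triples
translation = "http://example.org/isbn/2020386682"

print(authors_of_original(merged, translation))
print(authors_of_original(dataset_f, translation))
```

The merge works without any schema alignment because both sets use the same URI for the original book; that shared identifier is what joins the two graphs.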
  26. Data Merging with RDF
      - Mix schemas/vocabularies within one document
      - Less painful data merging
      - Mashups that work the way they’re supposed to!
  27. Linked Data Principles
      - Use URIs as names of things
        - Anything, not just documents
        - You are not your homepage
        - Information resources and non-information resources
      - Use HTTP URIs
        - Globally unique names, distributed ownership
        - Allows people to look up those names
      - Provide useful information in RDF
        - When someone looks up a URI
      - Include RDF links to other URIs
        - To enable discovery of related information
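"Looking up" an HTTP URI and asking for RDF is commonly done with an `Accept` header. The sketch below only prepares such a request with Python's standard `urllib.request` and does not send it. The URI is one of the examples from slide 10; whether a given server actually answers with RDF depends on how it is configured, so treat the media type here as an assumption about what the server supports.

```python
from urllib.request import Request

def rdf_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF rather than HTML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})

request = rdf_request("http://dbpedia.org/page/University_of_Hong_Kong")
print(request.get_method(), request.full_url, request.get_header("Accept"))

# To perform the lookup, pass the request to urllib.request.urlopen(request);
# this is left out so that the sketch runs without network access.
```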
  28. Why Publish Linked Data?
      - Ease of discovery
      - Ease of consumption
      - Standards-based data sharing
      - Reduced redundancy
      - Added value
      - Build ecosystems around your data/content
  29. The Linking Open Data Project
      - Community project with W3C support
        - Take existing open data sets
        - Make them available on the Web in RDF
        - Interlink them with other data sets
      - Began in early 2007
  30. Why is Linked Open Data Important?
      - Because in many cases it’s our data!
      - Efficiency, reducing redundancy
      - Promotes a digital society
      - Opens the door to data innovation and discovery
      - Holds the promise of creating from data
        - Knowledge
        - Wisdom
        - Benefit for all