Open Bibliographic Data and E-LIS: marrying good intentions

Presentation at the first international workshop for transfer of information for innovation, November 3, 2011, Valencia, Spain


Giannis Tsakonas

November 03, 2011


  1. Open Bibliographic Data and E-LIS: marrying good intentions

    Antonella De Robbio, Università degli Studi di Padova, Italy, E-LIS Executive Board
    Giannis Tsakonas, University of Patras, Greece, E-LIS Executive Board
    First international workshop for transfer of information for innovation, November 3, 2011, Valencia, Spain
  2. Background

    - Exchanging bibliographic datasets is a traditional task in the world of libraries.
    - It mainly takes the form of collaborative cataloguing, to avoid duplication of effort.
    - The openness of bibliographic data concerns a number of organizations and individuals, such as libraries and library consortia, indexing services, funding agencies, publishers and many more.
  3. Why open bibliographic data?

    - Freeing access to bibliographic information.
    - Making bibliographic data dynamic entities.
    - Identifying quality issues in bibliographic datasets.
    - Easing publication of small bibliographic datasets, as well as assembling them at will.
    - Facilitating the uploading of bibliographic data to the Linked Open Data cloud.
    - Advancing collaboration with other bibliographic organizations and systems.
    - Transforming bibliographic data to research, by mapping scholarly research and activity.
  4. Any complications?

    - “Closed” attitudes that perceive bibliographic data as static property.
    - Large coordinating organizations that follow rigid models or approach the situation reluctantly.
    - Cues of provenance that have been lost along the paths of cooperative schemes, such as WorldCat.
  5. Bibliographic data as linked data

    - A reminder that openness can:
      - facilitate the uploading of bibliographic data to the Linked Open Data cloud;
      - advance the reusability of bibliographic structures, such as catalogues, taxonomies, vocabularies, etc.
  6. Navigation in linked bibliographic data

    - VIAF: The Virtual International Authority File
    - http://viaf.org/
  7. An example: Libris

    - Libris released the Swedish National Bibliography as Linked Data in 2008.
    - It used related ontologies, such as FOAF for individuals, SKOS for subjects and BibO for parts of books.
    - It focused on availability (see “re-usability”) instead of a perfect representation of the MARC records.
    - It provides external links to Wikipedia, DBpedia, LC Authorities (names & subjects) and VIAF.
  8. An example: Libris - the license

    - From the site:* “... We see the investment in Open Data as a strategic one and one that is needed to ensure long term sustainability and competition when it comes to the services needed by libraries and their users as well as the right to control over their collections. The license chosen is CC0 which waives any rights the National Library have over the National Bibliography and the authority data... ”

    * http://bit.ly/qeGJTb
  9. How do we make bibliographic data open?

    - Today we encounter a few noteworthy initiatives for the opening of bibliographic data.
    - The effort is currently spearheaded by Open Bibliographic Principles, a “product” of the Open Knowledge Foundation.
    - It serves as a hub of software solutions, policy texts and application guidelines, as well as a communication node among interested parties.
  10. The Open Bibliography ecology

  11. Open Content. Open: Science, Government & Bibliographic Data

    - [Figure: Open Government Data Venn Diagram* by justgrimes. Circles: Open Science, Open Government, Open Bibliographic Data. Labels: scientific research data; raw data; research output as content; public data (environment, public policies, laws…).]

    * http://www.flickr.com/photos/notbrucelee/5241176871/
  12. Open Definition

    - The Open (Knowledge) Definition sets out principles to define the ‘open’ in open knowledge.
    - The term knowledge is used broadly: it includes all forms of data, content such as music, films or books, as well as any other types of information.
  13. Open Definition’s requirements

    - Open bibliographic data can be licensed in conformance with the Open Definition requirements.
    - In a nutshell, the Open Definition states: “A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.”
  14. Which licenses to apply?

    - Open or closed licenses?
    - Data and content may have separate rights.
    - We must distinguish between the “Database” and its “Contents”:
      - homogeneous DB (no need to distinguish “Database” and “Contents”)
      - non-homogeneous DB (need to distinguish “Database” and “Contents”)
    - Creative Commons (CC) licenses are for content.
    - Open Data Commons licenses are for data.
  15. Creative Commons licenses

    - CC licenses are expressed in 3 different formats:
      - the metadata (machine-readable code),
      - the Commons Deed (human-readable code),
      - the Legal Code (lawyer-readable code).
    - The core suite of Creative Commons licenses has four key terms:
      - Attribution (attribution stacking, usually none for LIS papers)
      - Non-Commercial (what counts as commercial? A&I tasks)
      - Share Alike (reduces interoperability)
      - No Derivatives (severely restricts use)
  16. CC: six regularly used licenses

    - Mixing and matching these conditions produces sixteen possible combinations.
    - The combination of CC tools by communities is a vast and growing digital commons, a pool of content that can be copied, distributed, edited, remixed, and built upon, all within the boundaries of copyright law.
    - The six regularly used licenses:
      - Attribution alone (by)
      - Attribution + Noncommercial (by-nc)
      - Attribution + NoDerivatives (by-nd)
      - Attribution + ShareAlike (by-sa)
      - Attribution + Noncommercial + NoDerivatives (by-nc-nd)
      - Attribution + Noncommercial + ShareAlike (by-nc-sa)
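
The step from sixteen combinations to six licenses can be worked through in a short sketch. The two filters used here are assumptions taken from the commonly given explanation rather than from the slide: a combination is discarded if it is empty or if it pairs NoDerivatives with ShareAlike (which contradict each other), and only combinations that include Attribution remain in regular use.

```python
from itertools import combinations

# The four key terms of the CC core suite, by their usual short codes.
CONDITIONS = ("by", "nc", "nd", "sa")


def all_combinations():
    """Every subset of the four conditions, the empty one included (2**4 = 16)."""
    return [
        frozenset(subset)
        for size in range(len(CONDITIONS) + 1)
        for subset in combinations(CONDITIONS, size)
    ]


def is_coherent(combo):
    """Assumption: an empty set is no license, and nd + sa exclude each other."""
    return bool(combo) and not {"nd", "sa"} <= combo


def is_regularly_used(combo):
    """Assumption: only coherent combinations that require attribution are in use."""
    return is_coherent(combo) and "by" in combo


def label(combo):
    """Render a combination in the conventional order, e.g. 'by-nc-sa'."""
    return "-".join(code for code in CONDITIONS if code in combo)


subsets = all_combinations()
coherent = [c for c in subsets if is_coherent(c)]
regular = sorted(label(c) for c in subsets if is_regularly_used(c))

print(len(subsets), len(coherent), len(regular))
print(regular)
```

Under these assumptions the counts are 16 possible, 11 coherent and 6 regularly used combinations, and the six labels match the list on the slide.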
  17. CC PDF Converter

    - CC PDF Converter is a free open source program that allows users to convert documents into PDF files on Microsoft Windows operating systems, while embedding a Creative Commons license.
    - It uses the following open source projects:
      - Redmon, a port monitor redirector (slightly patched)
      - GhostScript, used to create PDF files from the print PostScript output (with a minor addition)
      - libPNG and zlib, to display PNG images and to put license images into the document
      - XMLite, a simple XML parser by Kyung-min Cho, slightly modified
      - SQLite, a lightweight embedded database engine, to access the local license database
    - The CC PDF Converter and its source code are licensed under the GPL.
  18. CC tools

    - Creative Commons Rights Expression Language (CC REL) is a specification describing how license information may be described using RDF and how license information may be attached to works.
    - Besides licenses, CC also offers a way to release material into the public domain through CC0, a legal tool for waiving as many rights as legally possible, worldwide.
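
As a minimal sketch of what a CC REL statement amounts to, the snippet below writes license information as RDF triples in N-Triples syntax using only string formatting. The `cc:license` and `cc:attributionName` properties and the CC0 URI belong to the CC vocabulary; the dataset URI and the helper name `license_triples` are hypothetical and serve only as illustration.

```python
# Namespace of the Creative Commons vocabulary used by CC REL.
CC_NS = "http://creativecommons.org/ns#"
# URI of the CC0 1.0 public domain dedication.
CC0_URI = "http://creativecommons.org/publicdomain/zero/1.0/"


def license_triples(work_uri, license_uri, attribution_name=None):
    """Return N-Triples lines attaching a license (and optional credit) to a work."""
    lines = [f"<{work_uri}> <{CC_NS}license> <{license_uri}> ."]
    if attribution_name is not None:
        # Escape backslashes and double quotes for an N-Triples string literal.
        escaped = attribution_name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{work_uri}> <{CC_NS}attributionName> "{escaped}" .')
    return lines


# Hypothetical dataset URI, for illustration only.
triples = license_triples("http://example.org/dataset", CC0_URI)
print("\n".join(triples))
```

A real deployment would more likely embed the same statements as RDFa in a web page or produce them with an RDF library; plain strings are used here to keep the sketch self-contained.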
  19. Conformant content licenses

    | License | Domain | By | SA |
    |---|---|---|---|
    | Creative Commons Attribution | Content | Y | N |
    | Creative Commons Attribution Share-Alike | Content | Y | Y |
    | Creative Commons CCZero | Content, Data | N | N |
    | GNU Free Documentation License (only conformant subject to certain provisos) | Content | Y | Y |
    | UK PSI Public Sector Information | Content, Data | Y | N |
    | Free Art License | Content | Y | Y |
    | MirOS License | Code, Content | Y | N |
  20. Conformant data licenses

    | License | Summary | Domain | By | SA |
    |---|---|---|---|---|
    | Open Data Commons Public Domain Dedication and Licence (PDDL) | Dedicate to the Public Domain (all rights waived) | Data | N | N |
    | Open Data Commons Attribution License | Attribution for data(bases) | Data | Y | N |
    | Open Data Commons Open Database License (ODbL) | Attribution-ShareAlike for data(bases) | Data | Y | Y |
    | Creative Commons CCZero | Dedicate to the Public Domain (all rights waived) | Content, Data | N | N |
  21. Non-conformant licenses

    - Creative Commons No-Derivatives licenses (by-nd-*) violate principle 3, “Reuse”, as they do not allow works, in part or in whole, to be re-used in derivative works.
    - Creative Commons NonCommercial licenses (by-nc-*) do not support the Open Knowledge Definition principle 8, “No Discrimination Against Fields of Endeavor”, as they exclude usage in commercial activities.
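
The two rules on this slide can be applied mechanically to a CC license code. The following is a rough sketch that covers only those two rules and is not a full Open Definition conformance test; the function name `cc_conformance_problems` is hypothetical.

```python
def cc_conformance_problems(code):
    """List the Open Definition principles a CC license code conflicts with.

    Only the two rules named on the slide are checked: 'nd' conflicts with
    principle 3 (Reuse) and 'nc' with principle 8 (No Discrimination Against
    Fields of Endeavor). An empty list means neither rule is triggered.
    """
    parts = code.lower().split("-")
    problems = []
    if "nd" in parts:
        problems.append("principle 3 (Reuse): derivative works are not allowed")
    if "nc" in parts:
        problems.append(
            "principle 8 (No Discrimination Against Fields of Endeavor): "
            "commercial use is excluded"
        )
    return problems


for code in ("by", "by-sa", "by-nd", "by-nc-sa", "by-nc-nd"):
    issues = cc_conformance_problems(code)
    print(code, "->", "conformant" if not issues else "; ".join(issues))
```

Of the six regularly used CC licenses, only by and by-sa pass both checks, which agrees with the table of conformant content licenses.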
  22. Stars and clouds

    - MacKenzie Smith, in the frame of the LODLAM initiative, proposed a ranking for the openness of data from informational and cultural organizations, similar to the Linked Data ranking.*
      - ★★★★ Public Domain (CC0 / ODC PDDL / Public Domain Mark)
      - ★★★ Attribution License (CC-BY / ODC-BY)
      - ★★ Attribution License (CC-BY / ODC-BY) - method specified
      - ★ Attribution Share-Alike License (CC-BY-SA / ODC-ODbL)

    * http://bit.ly/kPWKHA
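
The ranking reads naturally as a lookup from license to star count. The sketch below encodes the four levels as listed on the slide; the identifiers, the `method_specified` flag and the function name `openness_stars` are illustrative choices, not part of the LODLAM proposal itself.

```python
# License families as grouped on the slide (identifiers are illustrative).
PUBLIC_DOMAIN = {"CC0", "ODC-PDDL", "Public Domain Mark"}
ATTRIBUTION = {"CC-BY", "ODC-BY"}
SHARE_ALIKE = {"CC-BY-SA", "ODC-ODbL"}


def openness_stars(license_id, method_specified=False):
    """Return the number of stars (1 to 4) for a license on the slide's scale.

    `method_specified` only matters for attribution licenses: the slide lists
    an attribution license with a specified method one star lower.
    """
    if license_id in PUBLIC_DOMAIN:
        return 4
    if license_id in ATTRIBUTION:
        return 2 if method_specified else 3
    if license_id in SHARE_ALIKE:
        return 1
    raise ValueError(f"license not covered by the ranking: {license_id}")


for name in ("CC0", "CC-BY", "ODC-ODbL"):
    print(name, "*" * openness_stars(name))
```

On this scale CC0, as chosen by Libris, ranks at four stars, while the ODbL ranks at one.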
  23. The case of E-LIS

    - In February 2011 the Executive Board (EB) received an invitation from the OKF to endorse OBD.
    - The invitation was discussed in the EB and was integrated into the Acropolis Strategy document.
    - After a thorough discussion, the EB decided in July 2011 to adopt the ODbL license, a one-star license.
    - Therefore E-LIS (meta)data are freely available to anyone, but attribution and share-alike are required.
    - In practice:
      - anyone can use the metadata for any purpose,
      - provided that they give an attribution to E-LIS,
      - and they can re-distribute the data (as is or in combination with other datasets) under the same terms.
  24. The rationale

    - E-LIS gets an attribution whenever the data is used, protecting and promoting the value-added work of our community.
    - E-LIS joins a coalition of other share-alike organizations and institutions.
    - The share-alike requirement aims at securing an open redistribution.
    - E-LIS is aware that the share-alike requirement may prove a constraint on extensive combination of its data with other datasets.
    - However, it meets the minimum requirement: it is a clear and explicit statement.
  25. Is E-LIS alone in the OB ecology?

    - The New Zealand National Library provides the national bibliography as MARC/MARCXML sets (approx. 350,000 records) licensed under a Creative Commons Attribution license.
    - From the site:* “... The records were originally created by the National Library of New Zealand, with a small number of contributions from the libraries of New Zealand through Te Puna. Bibliographic records and book cover images used under license by the National Library have been excluded from this dataset release... ”

    * http://www.natlib.govt.nz/services/data
  26. This and something more

    - E-LIS publishes JITA, a taxonomy of documents in Library and Information Science (currently under revision), as a Linked Open Dataset.
    - JITA is available through DataHub,* alternatively known as the Comprehensive Knowledge Archive Network, which is a registry of open knowledge datasets and projects.

    * http://ckan.net/dataset/jita
  27. What now?

    - Check the “ecology” figure again to identify where you stand.
    - Find the proper tools that you need to move in the direction you want.
    - Exploit them.
    - Check the open bibliographic data guide* to find related use cases.
    - Don’t forget the applicable national laws.

    * http://obd.jisc.ac.uk/
  28. Thank you for your attention! / ¡Gracias por su atención!

    - Creative Commons License - Attribution 1.0 Generic