
Moving from MARC: How BIBFRAME moves the Linked Data in Libraries conversation to large-scale action

December 03, 2014

Presenter: Eric Miller (Zepheira)

As the Library of Congress looked to the future of MARC, they looked to Linked Data principles and Semantic Web standards as the foundation of BIBFRAME. Libraries have an extensive history with MARC as a sophisticated and highly customized descriptive vocabulary with billions of records spread across systems and providers. In order to realize the value of connecting this legacy in new and contemporary ways, BIBFRAME’s design is intentionally extensible, with Profile-based vocabularies, flexible transformation utilities, and iterative linking strategies in mind. The migration from MARC (and other related Library standards) to BIBFRAME offers the most widely actionable opportunity for libraries to adopt Linked Data as a foundation of their Web visibility and internal operations.
This session will include a review of practical tools we have used in helping libraries:
• evaluate their current data
• define local data priorities
• perform large-scale transformation
• create profile-based definitions for original content
• identify linking options
• move beyond simply representing legacy data to take full advantage of the Linked Data nature of Web vocabularies like BIBFRAME and schema.org.
We benefit from looking back at how libraries have helped shape the Web of Data, and ahead to how, given these standards, we can together raise the visibility of libraries on the Web.




  1. Moving from MARC: How BIBFRAME moves the Linked Data in

    Libraries conversation to large-scale action SWIB 2014 Semantic Web in Libraries December 3, 2014 Bonn, Germany Eric Miller em@zepheira.com https://www.linkedin.com/in/erimille @erimille
  2. I believe that everyone benefits from the visibility of libraries

    and their content on the Web.
  3. Extremely encouraged • Tom Grahame - value proposition of “one

    page per thing” • Lessons learned from Europeana - “quantity has a quality all its own” • D-SWARM: middleware designed to empower domain experts (librarians) • Aliada - accelerate the publication of Library data in the Linked Open Data • Dan Scott - speak in the way the web understands • Richard Wallis - things not strings • #swib14
  4. A talk in 3 acts

  5. Act 1 : Context

  6. Minimum Viable Product, Incremental Value, and Continuous Learning

  7. RDF

  8. hasInstance creator subject publisher publishedAt format Work Instance Authority Authority

    Authority Authority Authority
  9. BIBFRAME Vocabulary

  10. Linked Data "a term used to describe a recommended best

    practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." http://en.wikipedia.org/wiki/Linked_Data
  11. General Technology Hype Cycle

  12. Phases of Linked Data / BIBFRAME Adoption Experimenters Early Implementers

    Data Publishers & Connectors Mainstream Workflow Back Office Systems • Clarify Space • Determine the Need • Define a Foundation • Draft Specifications • Test the Assumptions • Draft Standards • Evaluate Data, Processes, & Gaps • Begin to work at scale • Use other’s data • Participate – Publish, Share, Connect • “Final” Standards & Best Practices • New businesses and models • “There’s Linked Data in there!?”
  13. Phases of Linked Data / BIBFRAME Adoption

  14. Act 2 : Tools - Transformation

  15. MARC as “Things not Strings”

  16. MARC as “Things not Strings”

  17. MARC as “Things not Strings”

  18. MARC as “Things not Strings”

  19. MARC as “Things not Strings”

  20. hasInstance creator subject publisher publishedAt format Work Instance Authority Authority

    Authority Authority Authority BIBFRAME Core model for defining Web control points of bibliographic data for more effective sharing, navigation and collaboration Simple, replicable linked data patterns
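As a rough illustration of the pattern on this slide, the Work / Instance / Authority diagram can be written out as subject-property-object statements. This is only a sketch under stated assumptions: the `http://example.org/` identifiers are hypothetical, the property names are taken from the slide and assumed to live in the `http://bibframe.org/vocab/` namespace that appears on the profile slide, and which properties attach to the Work versus the Instance is one reading of the flattened diagram.

```python
# Namespace as it appears on the profile slide; treating the slide's
# property names as members of it is an assumption.
BF = "http://bibframe.org/vocab/"
# Hypothetical local namespace for the example resources.
EX = "http://example.org/"


def triple(subject, prop, obj):
    """Build one (subject, property, object) statement from short names."""
    return (EX + subject, BF + prop, EX + obj)


# One Work, one Instance, and the Authorities they point to.
triples = [
    triple("work/1", "hasInstance", "instance/1"),
    triple("work/1", "creator", "authority/person/1"),
    triple("work/1", "subject", "authority/topic/1"),
    triple("instance/1", "publisher", "authority/org/1"),
    triple("instance/1", "publishedAt", "authority/place/1"),
    triple("instance/1", "format", "authority/format/1"),
]


def outgoing(subject):
    """Return {property name: object URI} for one resource."""
    return {p[len(BF):]: o for s, p, o in triples if s == EX + subject}


def control_points():
    """Every distinct URI that some statement links to."""
    return sorted({o for _, _, o in triples})


if __name__ == "__main__":
    print(outgoing("work/1"))
    print(control_points())
```

Each object here is a URI rather than a text string, which is the sense in which the same small pattern can be repeated to add further control points.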
  21. And we can replicate these simple patterns to define as

    many control points we need creator subject provider mediaCategory Work Instance Authority Agent Category Category agent place Agent Place carrierCategory Category contentCategory Language language Event
  22. None
  23. None
  24. None
  25. None
  26. None
  27. None
  28. A link is worth a 1000 words

  29. None
  30. None
  31. None
  32. In Summary • Highly connected graph of data • Completely

    dark to the Web
  33. Act 2.5 : Tools - creation

  34. Opportunity

  35. BIBFRAME Profiles

  36. Small is Beautiful • BIBFRAME common model - flexible, designed

    to accommodate the needs of our community. • Recognize creative tension between past and future • Recognize creative tension of being useful across communities, but also community specific • Profiles are a simple, small subset of the model that supports a specific community or entity description but is sharable in a global context
  37. { "Profile": { "id": "bfp:Monograph:Book", "title": "Monograph -- Book", "description":

    "An example profile reflecting the cataloging practices of example public library", "date": "2013-05-01", "contact": "Example Public Library cataloging help desk, info@examplelib.org", "resourceTemplate": [ { "id": "bfp:Work:Book", "resourceLabel": "Book", "resourceURI": "http://bibframe.org/vocab/Book", "propertyTemplate": [ { "propertyURI": "http://bibframe.org/vocab/titleStatement", "propertyLabel": "Title", "type": "literal" }, { "propertyURI": "http://bibframe.org/vocab/subject", "propertyLabel": "Subject", "type": "resource", "valueConstraint": { "valueTemplateRef": [ "bfp:Agent:Person", "bfp:Agent:Organization", "bfp:Authority:Place", "bfp:Authority:ClassificationEntity", "bfp:Authority:Topic" ] } }, …..
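To make the role of a profile more concrete, the sketch below checks a description against the two property templates visible on the slide. It is an assumption-laden illustration, not how any BIBFRAME tool works: the `check` function is hypothetical, only the visible part of the profile is reproduced (the elided remainder is left out), and reading `"type": "resource"` as "the value must be a URI" is an interpretation.

```python
TITLE = "http://bibframe.org/vocab/titleStatement"
SUBJECT = "http://bibframe.org/vocab/subject"

# The visible part of the "bfp:Work:Book" resource template from the slide.
profile = {
    "id": "bfp:Work:Book",
    "resourceURI": "http://bibframe.org/vocab/Book",
    "propertyTemplate": [
        {"propertyURI": TITLE, "propertyLabel": "Title", "type": "literal"},
        {"propertyURI": SUBJECT, "propertyLabel": "Subject", "type": "resource"},
    ],
}


def check(description, template):
    """Return a list of problems; an empty list means the description fits.

    `description` maps property URIs to string values (hypothetical shape).
    """
    allowed = {pt["propertyURI"]: pt for pt in template["propertyTemplate"]}
    problems = []
    for prop, value in description.items():
        pt = allowed.get(prop)
        if pt is None:
            problems.append(f"{prop}: not in profile")
        elif pt["type"] == "resource" and not value.startswith(("http://", "https://")):
            # Assumed rule: a "resource" value is a link, not a text string.
            problems.append(f"{pt['propertyLabel']}: expected a URI")
    return problems


if __name__ == "__main__":
    print(check({TITLE: "Moby Dick", SUBJECT: "Whales"}, profile))
```

A small, community-specific template like this is enough to drive an editing form or a validation step while the underlying model stays shared.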
  38. None
  39. In Summary • Proof of concept extremely encouraging • Enormous

    potential for increased connectivity • No other community does authorities like we do • Control points for more effective discovery • We’re making it extremely difficult to connect • Lower costs to linking is critical to improved visibility
  40. Act 3 : Visibility

  41. Expectations of Library Web Visibility “When my community searches the

    Web for something we have, we better show up as an option.” - Chuck Gibson, Director & CEO, Worthington Public Library
  42. Can’t ignore the problem…

  43. Start with Agreement and Purpose “Everyone benefits from the visibility

    of libraries and their content on the Web.”
  44. Learning through action together

  45. Practical Practitioners Community http://zepheira.com/training

  46. Moving from web pages to “a web of data”

  47. But we aren’t speaking in a way the Web understands

    We have a wealth of content and resources locked behind legacy, closed technology systems and niche vocabularies
  48. The traditional, Visible Web focuses on Harvesting and Links to Pages
  49. The emerging Invisible Web focuses on Data, Resources, Vocabulary, and Connections
  50. New Vocabularies and Characteristics Retail – items, reviews, geo, descriptions, inventory, hours, social, events
  51. New Vocabularies and Characteristics Movies – Geo, reviews, ratings, images, previews, times, tickets
  52. New Vocabularies and Characteristics Restaurants – locations, reviews, hours, reservations,

  53. How does the Web see Libraries?

  54. Libraries = Community Businesses Location, photos, hours, reviews, social, events

  55. If at all….

  56. External Perspectives • Are websites and systems harvestable? • Is

    there a unified and accessible industry vocabulary? • Are there strong connections and relationships? • What is the consistency and reliability of the user experience and available data?
  57. 60+ Pages later.... still not even one entry that had

    anything to do with Libraries This is the now
  58. None
  59. Hardcover This is what a search engine harvester sees. Unconnected

    data results in poor page rank.
  60. Hardcover isHeldBy isHeldBy isHeldBy isHeldBy isHeldBy Good

  61. Hardcover holds holds holds holds holds Better

  62. Hardcover holds holds holds holds holds holds holds holds Best!

    And Linked Data is a key
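The progression from "Good" to "Best!" on the last three slides can be sketched as deriving inbound `holds` links from outbound `isHeldBy` statements and counting them. This is a toy illustration only: the property names come from the slides, the `http://example.org/` identifiers are hypothetical, and the count stands in for the general idea that connected data is easier to find; it is not a model of how any search engine ranks pages.

```python
from collections import Counter

HARDCOVER = "http://example.org/instance/hardcover"  # hypothetical URI

# "Good": the item itself says which libraries hold it.
held_by = [
    (HARDCOVER, "isHeldBy", "http://example.org/library/a"),
    (HARDCOVER, "isHeldBy", "http://example.org/library/b"),
    (HARDCOVER, "isHeldBy", "http://example.org/library/c"),
]


def invert(statements, forward="isHeldBy", inverse="holds"):
    """Turn each forward statement into its inverse, pointing back at the item."""
    return [(o, inverse, s) for s, p, o in statements if p == forward]


def inbound_links(statements):
    """Count how many statements point at each resource."""
    return Counter(o for _, _, o in statements)


if __name__ == "__main__":
    # "Better": each library publishes a link to the item.
    holds = invert(held_by)
    print(inbound_links(holds)[HARDCOVER])
```

Every additional library that publishes its own `holds` statement adds one more inbound link to the same item, which is the pattern the "Best!" slide depicts.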
  63. A link is worth a 1000 words

  64. And this pattern is already happening in many localized markets

    as we speak
  65. None
  66. None
  67. But we’re very close • MARC To BIBFRAME (social) •

    Frustration with consolidation in marketplace (economic) • Web is increasingly actionable / semantic e.g. schema.org (technical)
  68. BIBFRAME Purpose and Promise • Purpose: Replacing MARC • Promise:

    So much more • Purpose: Serving Libraries • Promise: Related memory organizations and the users they serve • Purpose: Leverage existing Web standards to speak with a consistent voice • Promise: Visibility, Discovery and Effectiveness
  69. } Description Discovery Web Friendly

  70. “One Page Per Thing”

  71. None
  72. Moving the Needle and Transforming the Web • NO NEED

    TO WAIT • Build on existing investments • Use BIBFRAME to reflect content in the Web • Leverage the Web’s cooperative infrastructure • Link between shared & Web assets to test impact on results • Help the Web understand library vocabularies • Connect into legacy systems
  73. None
  74. Incremental Steps 1. Make it extremely easy to project Library

    data to Linked Data (BIBFRAME) 2. Start with Visibility – publish to the Web in a way the Web understands 3. Links! 4. RDFa (schema.org, BIBFRAME) 5. Increase discoverability 6. Accelerate linking among / across assets 7. Learn! Inform! Educate! Iterate!
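Step 4 above can be illustrated with a small sketch that renders a book description as HTML carrying RDFa attributes in the schema.org vocabulary. The helper `book_rdfa` and the example URL are hypothetical; `vocab`, `typeof`, `resource`, and `property` are standard RDFa attributes, and `Book`, `Person`, `name`, and `author` are schema.org terms. A real page would carry more properties than this.

```python
from html import escape


def book_rdfa(name, author, url):
    """Return an HTML fragment describing one book with schema.org RDFa."""
    return (
        f'<div vocab="http://schema.org/" typeof="Book" resource="{escape(url)}">\n'
        f'  <span property="name">{escape(name)}</span>\n'
        f'  by <span property="author" typeof="Person">'
        f'<span property="name">{escape(author)}</span></span>\n'
        f'</div>'
    )


if __name__ == "__main__":
    print(book_rdfa("Moby Dick", "Herman Melville",
                    "http://example.org/instance/1"))
```

Because the markup sits inside the ordinary page, the same HTML serves human readers and harvesters, which fits the "one page per thing" idea from earlier in the talk.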
  75. I We believe that everyone benefits from the visibility of

    libraries and their content on the Web. http://zepheira.com/linkeddatatraining-201501a/ http://libhub.org
  76. Getting Involved Learn more @ http://zepheira.com/solutions/library/ Code @ http://github.com/zepheira/pybibframe http://github.com/zepheira/bibframe-scribe

  77. Thank you Eric Miller em@zepheira.com