
Moving from MARC: How BIBFRAME moves the Linked Data in Libraries conversation to large-scale action

SWIB14
December 03, 2014


Presenter: Eric Miller (Zepheira)

Abstract:
As the Library of Congress looked to the future of MARC, they looked to Linked Data principles and Semantic Web standards as the foundation of BIBFRAME. Libraries have an extensive history with MARC as a sophisticated and highly customized descriptive vocabulary with billions of records spread across systems and providers. In order to recognize the value of connecting this legacy in new and contemporary ways, BIBFRAME’s design is intentionally extensible with Profile-based vocabularies, flexible transformation utilities, and iterative linking strategies in mind. The migration from MARC (and other related Library standards) to BIBFRAME offers the most widely actionable opportunity for libraries to adopt Linked Data as a foundation of their Web visibility and internal operations.
This session will include a review of practical tools we have used in helping libraries:
• evaluate their current data
• define local data priorities
• perform large-scale transformation
• create profile-based definitions for original content
• identify linking options
• move beyond simply representing legacy data to take full advantage of the Linked Data nature of Web vocabularies like BIBFRAME and schema.org.
We benefit from looking back at how libraries have helped shape the Web of Data, and ahead to how, given these standards, we can together raise the visibility of libraries on the Web.


Transcript

  1. Moving from MARC: How BIBFRAME
    moves the Linked Data in Libraries
    conversation to large-scale action
    SWIB 2014
    Semantic Web in Libraries
    December 3, 2014
    Bonn, Germany
    Eric Miller
    [email protected]
    https://www.linkedin.com/in/erimille
    @erimille


  2. I believe that everyone benefits from
    the visibility of libraries and their
    content on the Web.


  3. Extremely encouraged
    • Tom Grahame - value proposition of “one page per thing”
    • Lessons learned from Europeana - “quantity has a quality all its own”
    • D-SWARM: middleware designed to empower domain experts
    (librarians)
    • Aliada - accelerate the publication of Library data in the Linked
    Open Data
    • Dan Scott - speak in the way the web understands
    • Richard Wallis - things not strings
    • #swib14


  4. A talk in 3 acts


  5. Act 1 : Context


  6. Minimum Viable Product,
    Incremental Value, and Continuous
    Learning


  7. RDF

  8. [Diagram: Work, Instance, and Authority nodes connected by the
    properties hasInstance, creator, subject, publisher, publishedAt,
    and format]


  9. BIBFRAME Vocabulary

  10. Linked Data
    "a term used to describe a recommended best practice for exposing,
    sharing, and connecting pieces of data, information, and knowledge
    on the Semantic Web using URIs and RDF."
    http://en.wikipedia.org/wiki/Linked_Data


  11. General Technology Hype Cycle


  12. Phases of Linked Data / BIBFRAME Adoption
    Phases: Experimenters, Early Implementers, Data Publishers &
    Connectors, Mainstream Workflow, Back Office Systems
    • Clarify Space
    • Determine the Need
    • Define a Foundation
    • Draft Specifications
    • Test the Assumptions
    • Draft Standards
    • Evaluate Data, Processes, & Gaps
    • Begin to work at scale
    • Use others’ data
    • Participate – Publish, Share, Connect
    • “Final” Standards & Best Practices
    • New businesses and models
    • “There’s Linked Data in there!?”


  13. Phases of Linked Data / BIBFRAME Adoption


  14. Act 2 :
    Tools - Transformation


  15. MARC as “Things not Strings”


  16. MARC as “Things not Strings”


  17. MARC as “Things not Strings”


  18. MARC as “Things not Strings”


  19. MARC as “Things not Strings”


  20. BIBFRAME
    [Diagram: Work, Instance, and Authority nodes connected by the
    properties hasInstance, creator, subject, publisher, publishedAt,
    and format]
    Core model for defining Web control points of bibliographic
    data for more effective sharing, navigation and collaboration
    Simple, replicable linked data patterns
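
The Work / Instance / Authority pattern on this slide can be written down as plain subject-predicate-object statements. The following is a minimal sketch, not the BIBFRAME reference tooling: the property names come from the slide, the namespace is the one used in the profile example later in this deck, and every example.org URI, as well as the choice of which properties hang off the Work versus the Instance, is an assumption made for illustration.

```python
# Namespace as it appears in the profile example in this deck.
BF = "http://bibframe.org/vocab/"


def bf(name):
    """Build a full property URI from a short BIBFRAME property name."""
    return BF + name


# Hypothetical resource URIs, invented for this sketch.
WORK = "http://example.org/work/1"
INSTANCE = "http://example.org/instance/1"
AUTHOR = "http://example.org/authority/person/1"
TOPIC = "http://example.org/authority/topic/1"
PUBLISHER = "http://example.org/authority/org/1"

# Each statement links two "things" by URI rather than by a text string.
# Which node carries which property is an assumption, not read off the slide.
triples = [
    (WORK, bf("hasInstance"), INSTANCE),
    (WORK, bf("creator"), AUTHOR),
    (WORK, bf("subject"), TOPIC),
    (INSTANCE, bf("publisher"), PUBLISHER),
]


def objects(graph, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def to_ntriples(graph):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in graph)


print(to_ntriples(triples))
```

Because every node is a URI, adding another control point is just another triple that follows the same pattern.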


  21. And we can replicate
    these simple patterns
    to define as many
    control points as we need
    [Diagram: Work, Instance, Authority, Agent, Place, Category,
    Language, and Event nodes connected by the properties creator,
    subject, provider, agent, place, mediaCategory, carrierCategory,
    contentCategory, and language]



  28. A link is worth a thousand words



  32. In Summary
    • Highly connected graph of data
    • Completely dark to the Web


  33. Act 2.5 :
    Tools - creation


  34. Opportunity


  35. BIBFRAME Profiles

  36. Small is Beautiful
    • BIBFRAME common model -
    flexible, designed to accommodate
    the needs of our community.
    • Recognize creative tension between
    past and future
    • Recognize creative tension of being
    useful across communities, but also
    community specific
    • Profiles are a simple, small subset
    of the model that supports a specific
    community or entity description but is
    sharable in a global context

  37. {
      "Profile": {
        "id": "bfp:Monograph:Book",
        "title": "Monograph -- Book",
        "description": "An example profile reflecting the cataloging practices of example public library",
        "date": "2013-05-01",
        "contact": "Example Public Library cataloging help desk, [email protected]",
        "resourceTemplate": [
          {
            "id": "bfp:Work:Book",
            "resourceLabel": "Book",
            "resourceURI": "http://bibframe.org/vocab/Book",
            "propertyTemplate": [
              {
                "propertyURI": "http://bibframe.org/vocab/titleStatement",
                "propertyLabel": "Title",
                "type": "literal"
              },
              {
                "propertyURI": "http://bibframe.org/vocab/subject",
                "propertyLabel": "Subject",
                "type": "resource",
                "valueConstraint": {
                  "valueTemplateRef": [ "bfp:Agent:Person",
                                        "bfp:Agent:Organization",
                                        "bfp:Authority:Place",
                                        "bfp:Authority:ClassificationEntity",
                                        "bfp:Authority:Topic" ]
                }
              },
              …..
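
To make the role of a profile more concrete, here is a rough sketch of how a tool might use the property templates above to check a new description. It is not how any BIBFRAME editor is known to work: the function `check_description`, the shape of the `description` dictionary, and the rule that a "resource" value must look like an http(s) URI are all assumptions made for this example. Only the two property templates are taken from the slide.

```python
# A trimmed copy of the "bfp:Work:Book" resource template from the slide.
book_template = {
    "id": "bfp:Work:Book",
    "resourceURI": "http://bibframe.org/vocab/Book",
    "propertyTemplate": [
        {
            "propertyURI": "http://bibframe.org/vocab/titleStatement",
            "propertyLabel": "Title",
            "type": "literal",
        },
        {
            "propertyURI": "http://bibframe.org/vocab/subject",
            "propertyLabel": "Subject",
            "type": "resource",
        },
    ],
}


def check_description(template, description):
    """Return a list of problems; an empty list means the description fits.

    `description` maps property URIs to lists of string values
    (a hypothetical input shape chosen for this sketch).
    """
    allowed = {pt["propertyURI"]: pt for pt in template["propertyTemplate"]}
    problems = []
    for prop, values in description.items():
        pt = allowed.get(prop)
        if pt is None:
            problems.append(f"{prop}: not part of this profile")
            continue
        for value in values:
            looks_like_uri = value.startswith(("http://", "https://"))
            # Assumption: a "resource" property must point at a URI, not a string.
            if pt["type"] == "resource" and not looks_like_uri:
                label = pt["propertyLabel"]
                problems.append(f"{label}: expected a link, got text {value!r}")
    return problems


TITLE = "http://bibframe.org/vocab/titleStatement"
SUBJECT = "http://bibframe.org/vocab/subject"

good = {TITLE: ["Moby Dick"], SUBJECT: ["http://example.org/authority/topic/whales"]}
bad = {SUBJECT: ["Whales"], "http://example.org/vocab/shelfColour": ["blue"]}

print(check_description(book_template, good))
print(check_description(book_template, bad))
```

The point of the sketch is the division of labour: the shared model stays general, while a small profile decides which properties a given community actually fills in and whether each one is a string or a link.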



  39. In Summary
    • Proof of concept extremely encouraging
    • Enormous potential for increased connectivity
    • No other community does authorities like we do
    • Control points for more effective discovery
    • We’re making it extremely difficult to connect
    • Lower costs to linking is critical to improved visibility


  40. Act 3 : Visibility


  41. Expectations of Library Web
    Visibility
    “When my community searches the Web for
    something we have, we better show up as an
    option.”
    - Chuck Gibson, Director & CEO, Worthington Public Library


  42. Can’t ignore the problem…


  43. Start with Agreement and
    Purpose
    “Everyone benefits from the visibility
    of libraries and their content on the
    Web.”


  44. Learning through action together


  45. Practical Practitioners Community
    http://zepheira.com/training


  46. Moving from web pages to
    “a web of data”



  47. But we aren’t speaking in a way
    the Web understands
    We have a wealth
    of content and
    resources locked
    behind legacy,
    closed technology
    systems and niche
    vocabularies


  48. The traditional, Visible Web focuses on
    Harvesting and Links to Pages


  49. The emerging Invisible Web focuses on
    Data, Resources, Vocabulary, and Connections


  50. New Vocabularies and Characteristics

    Retail – items, reviews, geo, descriptions, inventory, hours, social, events


  51. New Vocabularies and Characteristics

    Movies – Geo, reviews, ratings, images, previews, times,
    tickets


  52. New Vocabularies and Characteristics

    Restaurants – locations, reviews, hours, reservations, menus


  53. How does the Web
    see Libraries?


  54. Libraries = Community Businesses

    Location, photos, hours, reviews, social, events


  55. If at all….


  56. External Perspectives
    • Are websites and systems harvestable?
    • Is there a unified and accessible industry
    vocabulary?
    • Are there strong connections and relationships?
    • What is the consistency and reliability of the user
    experience and available data?


  57. 60+ Pages later.... still not
    even one entry that had
    anything to do with
    Libraries
    This is the now



  59. Hardcover
    This is what a search engine harvester sees.
    Unconnected data results in poor page rank.


  60. Hardcover
    isHeldBy
    isHeldBy
    isHeldBy
    isHeldBy
    isHeldBy
    Good


  61. Hardcover
    holds
    holds
    holds
    holds
    holds
    Better


  62. Hardcover
    holds
    holds
    holds
    holds
    holds
    holds
    holds
    holds
    Best!
    And Linked Data is a key
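
One reading of the Good / Better / Best pictures is that the direction of the link matters: an isHeldBy statement points from the item out to a library, while a holds statement points from a library in to the item. Under that assumption, the sketch below derives holds statements from isHeldBy statements and counts the links that arrive at the item. The property names come from the slides; the example.org URIs and the helper names are invented for illustration, and nothing here models how a search engine actually ranks pages.

```python
def invert(triples, predicate, inverse_predicate):
    """For each (s, predicate, o) statement, produce (o, inverse_predicate, s)."""
    return [(o, inverse_predicate, s) for s, p, o in triples if p == predicate]


def inbound_links(triples, target):
    """Count statements whose object is `target`."""
    return sum(1 for s, p, o in triples if o == target)


# Hypothetical URIs for one hardcover item and five libraries.
BOOK = "http://example.org/instance/hardcover-1"
LIBRARIES = [f"http://example.org/library/{n}" for n in range(1, 6)]

# "Good": the item says which libraries hold it (links point outward).
held_by = [(BOOK, "isHeldBy", library) for library in LIBRARIES]

# "Better": each library says what it holds (links point at the item).
holds = invert(held_by, "isHeldBy", "holds")

print(inbound_links(held_by, BOOK), "inbound links from isHeldBy statements")
print(inbound_links(holds, BOOK), "inbound links from holds statements")
```

In this reading, "Best" is simply the same pattern with more libraries publishing holds links to the same item.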


  63. A link is worth a thousand words


  64. And this pattern is already happening
    in many localized markets as we speak



  67. But we’re very close
    • MARC To BIBFRAME (social)
    • Frustration with consolidation in marketplace
    (economic)
    • Web is increasingly actionable / semantic e.g.
    schema.org (technical)


  68. BIBFRAME
    Purpose and Promise
    • Purpose: Replacing MARC
    • Promise: So much more
    • Purpose: Serving Libraries
    • Promise: Related memory organizations and the users they serve
    • Purpose: Leverage existing Web standards to speak with a consistent
    voice
    • Promise: Visibility, Discovery and Effectiveness


  69. [Diagram: Description and Discovery grouped by a brace
    labeled “Web Friendly”]


  70. “One Page Per Thing”



  72. Moving the Needle and
    Transforming the Web
    • NO NEED TO WAIT
    • Build on existing investments
    • Use BIBFRAME to reflect content in the Web
    • Leverage the Web’s cooperative infrastructure
    • Link between shared & Web assets to test impact on results
    • Help the Web understand library vocabularies
    • Connect into legacy systems



  74. Incremental Steps
    1. Make it extremely easy to project Library data to Linked Data
    (BIBFRAME)
    2. Start with Visibility – publish to the Web in a way the Web understands
    3. Links!
    4. RDFa (schema.org, BIBFRAME)
    5. Increase discoverability
    6. Accelerate linking among / across assets
    7. Learn! Inform! Educate! Iterate!
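
Steps 2 to 4 above can be sketched as a small function that publishes one record as an HTML fragment annotated with RDFa and schema.org terms. This is only an illustration under stated assumptions: `book_rdfa` is a hypothetical helper, the sample values are invented, and modeling availability as an Offer whose seller is a Library is one possible choice among several, not something the slides prescribe.

```python
from html import escape


def book_rdfa(book_url, title, author, library_name):
    """Return an HTML fragment describing a book with schema.org terms in RDFa."""
    return "\n".join([
        f'<div vocab="http://schema.org/" typeof="Book" resource="{escape(book_url)}">',
        f'  <h1 property="name">{escape(title)}</h1>',
        '  <p property="author" typeof="Person">',
        f'    <span property="name">{escape(author)}</span>',
        '  </p>',
        # Assumption: availability is expressed as an Offer made by a Library.
        '  <p property="offers" typeof="Offer">',
        '    Available from',
        '    <span property="seller" typeof="Library">',
        f'      <span property="name">{escape(library_name)}</span>',
        '    </span>',
        '  </p>',
        '</div>',
    ])


# Invented sample values for the sketch.
snippet = book_rdfa(
    "http://example.org/book/1",
    "Cats & Dogs",
    "A. Author",
    "Example Public Library",
)
print(snippet)
```

The same page remains readable by people, while the `vocab`, `typeof`, `property`, and `resource` attributes give a harvester typed things and links instead of undifferentiated text.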


  75. We believe that everyone benefits
    from the visibility of libraries and their
    content on the Web.
    http://zepheira.com/linkeddatatraining-201501a/
    http://libhub.org


  76. Getting Involved
    Learn more @
    http://zepheira.com/solutions/library/
    Code @
    http://github.com/zepheira/pybibframe
    http://github.com/zepheira/bibframe-scribe


  77. Thank you
    Eric Miller
    [email protected]
