Using linked data to annotate semantically the BBC's content

SWIB14
December 02, 2014

Presenter: Tom Grahame

Abstract:
Linked Data at the BBC emerged as a set of ideas, techniques and technologies to build websites and has gone on to show how those techniques can improve and simplify production workflows, and provide interesting automated aggregations for our audiences. The success of applying the technology to deliver the online coverage of major sporting events has demonstrated the potential for reusing the semantic infrastructure as a central part of the BBC production workflow. To support this decision, the vision of semantic publishing at the BBC evolved towards connecting content around the things that matter to our audiences - those things can be politicians, athletes or musicians, places or organisations, topics of study or events.

The BBC produces a plethora of content every day about these things, and the content varies from news articles, to programmes, to educational guides, clips and recipes. Because it is commissioned and used in different audience-facing products, this content is mastered in separate and disconnected systems, yet the things that the content is about are the same. By semantically describing and annotating the content with the things it is about, we enable journalists and content editors to access heterogeneous and previously isolated creative works in a unified manner.

In this talk I will describe how BBC Sport's use of Linked Data has evolved from developing a single website covering the 2010 World Cup to supporting the annotation and dynamic aggregation of daily sports coverage and every major event, including London 2012, Sochi 2014 and the 2014 FIFA World Cup. I will also discuss how the same platform and technology approach is being deployed across the BBC in domains as diverse as Education, News, Radio and Music, and how a Linked Data approach could be applied to similar challenges in the Library environment.
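
The annotation idea in the abstract can be illustrated with a minimal sketch: creative works held in separate systems are tagged with the URI of the thing they are about, and an aggregation is then just a lookup by that URI. This is not the BBC's implementation; the `CreativeWork` class, the `aggregate_by_thing` function and every `example.org` URI below are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CreativeWork:
    """A piece of content mastered in some source system (hypothetical model)."""
    uri: str
    kind: str                      # e.g. "news article", "programme", "recipe"
    title: str
    about: frozenset = field(default_factory=frozenset)  # URIs of tagged things


def aggregate_by_thing(works):
    """Group works by the URI of each thing they are annotated with."""
    index = defaultdict(list)
    for work in works:
        for thing_uri in sorted(work.about):
            index[thing_uri].append(work)
    return dict(index)


# Hypothetical thing URIs; real identifiers would come from a shared vocabulary.
WORLD_CUP = "http://example.org/things/2014-fifa-world-cup"
BRAZIL = "http://example.org/things/brazil"

# Works that would normally live in separate, disconnected systems.
WORKS = [
    CreativeWork("http://example.org/news/1", "news article",
                 "Hosts prepare for opening match", frozenset({WORLD_CUP, BRAZIL})),
    CreativeWork("http://example.org/programmes/7", "programme",
                 "Tournament highlights", frozenset({WORLD_CUP})),
    CreativeWork("http://example.org/recipes/3", "recipe",
                 "Feijoada", frozenset({BRAZIL})),
]

if __name__ == "__main__":
    for work in aggregate_by_thing(WORKS)[WORLD_CUP]:
        print(work.kind, "-", work.title)
```

Because the index is keyed on the thing rather than on the source system, a news article and a programme appear in the same aggregation without either system knowing about the other.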

Transcript

  1. Using Linked Data to
    annotate semantically the
    BBC's content
    Tom Grahame
    Data Architect, BBC Future
    Media
    @tfgrahame

  2. Classification
    Annotation
    (tagging)
    Topics
    Locations

  3. Tagging
    Web Page
    publishing
    Results/Fixtures
    Navigation

  4. http://www.bbc.co.uk/ontologies

  5. Mango – A tool supporting automatic tag suggestion
    • Developed by BBC R&D
    • Builds an internal model using a corpus of Wikipedia page content
    • Parses content and matches topics to DBpedia URIs
    • Performs Named Entity Recognition and Disambiguation

  6. • Governance
    • Legacy data
    • Errors
    • New technology
    • Understanding

  7. http://www.library.manchester.ac.uk/aboutus/projects
    eScholar: Institutional Repository Project, 2007 - Present
    MISS: MaDAM Into Sustainable Services, 2011-13
    MaDAM: Manchester Data Management, 2009-10
    CAIRO: Complex Archive Ingest For Repository, 2006-08

  8. Thank you!
    Questions?

  9. Image credits
    • All fragments of websites and tools are screenshots of BBC resources
    • Mango - http://upload.wikimedia.org/wikipedia/commons/c/c0/Mango_(1).jpg
    • John Rylands Library Interior - http://upload.wikimedia.org/wikipedia/commons/8/88/The_John_Rylands_Library_Interior.jpg
    • Main Library Exterior - http://upload.wikimedia.org/wikipedia/commons/8/87/MainLibraryExterior.jpg
    • Sport Climber - http://upload.wikimedia.org/wikipedia/commons/6/6a/Sport_Climbing.jpg
