
"It's all in the {find} y'know" @ the BBC

Elastic Co
November 15, 2016


In this presentation, Michael outlines how the Elastic Stack has been deployed to meet the challenges of maximising value from the BBC’s archives. Using real-world examples, see how today’s technology is unlocking yesterday’s content, along with a vision for the future of media search.

Michael Satterthwaite | Senior Product Manager | BBC


Transcript

  1. 2 “It’s all in the {find} y’know”

    •  whoami
    •  The problem
    •  The solution (Part I)
    •  Why are we doing this?
    •  The solution (Part II)
    •  Business Impact
    •  Making the problem harder
    •  What’s next?
  2. 3

  3. 5 What is the BBC Archive?

    •  TV programmes/rushes/stockshots (inc film reels and video tapes, online/offline digital files)
    •  Radio programmes/rushes (inc audio tape, vinyl, CDs, online/offline digital files)
    •  6m photographs (inc negatives, transparencies, prints and CDs)
    •  4m copies of sheet music
    •  Miles of hardcopy documents
    •  Old technology (inc TV cameras, microphones)
    •  Works of Art (inc paintings, statues)
    •  Metadata about programmes, holdings, broadcasts (inc Radio Times)
  4. 7 Prototyping and the BVP

    •  Methodology – almost agile
    •  Prototype Journey
    •  BVP*

    *(http://blog.ryantan.net/2013/03/the-barely-viable-product/)
  5. 11 Technologies & Frameworks

    Tech: Description
    Elastic Stack: Performant search engine, out of the box query builder, system and user activity reporting
    MongoDB: Document store used for data staging prior to index
    Angular.js: Client-side application framework
    Bootstrap: Lots of nice CSS components
    LESS: CSS pre-processor
    Node.js: Server-side JavaScript platform for lightweight, fast, network apps
    Express.js: Web server framework which runs on top of Node
    Gulp: Build tool
  6. 12 Architecture (diagram; component labels only)

    •  HTTP Loadbalancer (HA Proxy)
    •  NginX
    •  Web Portal UI
    •  Media/Metadata stores
    •  Rewind HTTP RESTful API
    •  A, B, C
    •  Authentication
    •  MongoDB
    •  Enrichment Services: Concept Extraction, Speech-to-text, Face recognition, Text in images
    •  Media Services: Proxy generator, Thumbnail generator, Subclipping, Media movement
    •  Job Management
    •  Imp shim (×3) ++
  7. 16

  8. 18 Business Impact

    •  TTAC – Time to access content reduced from a couple of days to minutes
    •  Unlocked assets to a new internal userbase
    •  Improved engagement with our audiences
    •  Some way along the journey to solving the problems we had to tackle
  9. 20 “The Rewind Portal makes archive readily accessible and searchable – it’s really going to augment our content.”

    Damien Magee, Editor of BBC Newsline
  10. 21 Changes to Audience Engagement

    (Chart; axis labels: Number of Engagements, Assets available in Portal)
  11. 23 Market Factors

    •  Accelerating rate at which content is being produced
    •  Move to self service model for archive access
    •  Declining cost of compute
    •  New algorithms available, machine learning at scale is here, Skynet looms
  12. 26 How good is good enough?

    •  Confidence levels
    •  Cross referencing sources at scale
    •  Elastic allows us to explore
  13. 27
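
The slides mention staging documents in MongoDB prior to indexing, and later raise confidence levels and cross referencing of sources for machine-generated metadata. The sketch below is one possible way to combine those two ideas; it is not taken from the deck. The field names (`enrichments`, `tag`, `source`, `confidence`), the index name `archive-assets`, and both thresholds are hypothetical. It keeps a tag when one source is confident enough or when several distinct sources agree, then renders the staged documents as a newline-delimited body in the shape the Elasticsearch Bulk API accepts.

```python
import json


def confident_tags(enrichments, min_confidence=0.8, min_sources=2):
    """Return tags that pass a confidence or cross-reference check.

    A tag is kept if any single hit reaches min_confidence, or if at
    least min_sources distinct sources reported it. Both thresholds are
    illustrative assumptions, not values from the talk.
    """
    by_tag = {}
    for hit in enrichments:
        # Normalise case so "Belfast" and "belfast" count as one tag.
        by_tag.setdefault(hit["tag"].lower(), []).append(hit)

    kept = []
    for tag, hits in by_tag.items():
        sources = {h["source"] for h in hits}
        best = max(h["confidence"] for h in hits)
        if best >= min_confidence or len(sources) >= min_sources:
            kept.append(tag)
    return sorted(kept)


def to_bulk_body(staged_docs, index_name="archive-assets"):
    """Render staged documents as a Bulk API body (NDJSON).

    Each document becomes an action line followed by a source line.
    Older Elasticsearch releases also expect a _type, supplied in the
    action line or in the request URL.
    """
    lines = []
    for doc in staged_docs:
        action = {"index": {"_index": index_name, "_id": str(doc["_id"])}}
        source = {
            "title": doc["title"],
            "tags": confident_tags(doc.get("enrichments", [])),
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline; an empty batch yields "".
    return "\n".join(lines) + "\n" if lines else ""
```

The returned string would be sent as the body of a `_bulk` request. Filtering before indexing is only one option; storing every tag together with its confidence and filtering at query time would leave the question of how good is good enough open to later exploration.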