Relational taxonomic meta model
2) Static! Inflexible! SSI!
3) Document publishing
4) Content not re-usable
5) Content not re-purposable
6) Difficult to personalize
7) Publication per output
World Cup 2010
736 players, 776 pages
2. Fixtures & Results, Groups & Teams pages
3. Too many web pages for too few journalists
4. Improve the publishing system to help achieve all of this
• Huge increase in content breadth (number of manageable pages)
• Content re-use and re-purposing, increasing reach
• Simplified content management
• Journalist headcount reduction
• Multi-dimensional entry points and semantic navigation
• Improved user experience with high levels of user engagement
• Dynamic, state (time|event) and semantic driven page layout
• Personalized content aggregations
• Open data and APIs
World Cup statistics the GOOD
Squad, Group, etc..)
• Average unique page requests a day: 2 million+
• Average OWLIM SPARQL queries a day: 1 million
• Hundreds of RDF statement updates/inserts per minute with full OWL reasoning and associated inference
• Multi data center, fully resilient, clustered 6-node triple store
• RDF graph model ideally suited to modelling domain representations such as sport
World Cup statistics the BAD
static
• Sport content not responsive or personalized
• RDF store unable to handle thousands of statistic updates a second
• RDF store forward-chained closures are expensive and increase write latency
• RDF graph model and SPARQL not ideally suited to the BBC’s News and Sport document publication model
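The write-latency point can be illustrated with a minimal sketch of forward-chained materialization. It assumes a toy in-memory store and a single transitive rule over a hypothetical `partOf` predicate; it is not how OWLIM is implemented, only an illustration of why one insert can trigger several inferred writes.

```python
# Toy forward-chaining store: every insert eagerly materializes the
# transitive closure of one predicate. All names here are hypothetical.
TRANSITIVE = "partOf"


class ForwardChainingStore:
    def __init__(self):
        self.triples = set()

    def insert(self, s, p, o):
        """Insert a triple and return how many triples were written in total."""
        before = len(self.triples)
        self.triples.add((s, p, o))
        if p == TRANSITIVE:
            self._close()
        return len(self.triples) - before

    def _close(self):
        # Naive fixpoint: keep deriving (a partOf c) from (a partOf b)
        # and (b partOf c) until nothing new appears.
        changed = True
        while changed:
            changed = False
            links = [(s, o) for (s, p, o) in self.triples if p == TRANSITIVE]
            for a, b in links:
                for b2, c in links:
                    if b == b2 and (a, TRANSITIVE, c) not in self.triples:
                        self.triples.add((a, TRANSITIVE, c))
                        changed = True


store = ForwardChainingStore()
store.insert("player", "partOf", "squad")
store.insert("squad", "partOf", "team")
# One explicit statement, but the inferred closure is written as well.
written = store.insert("team", "partOf", "group")
print(written)
```

Reads stay cheap because the inferred statements are already stored, while the cost is paid on every write, which is the trade-off the slide describes.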
per Athlete [10,000+], Page per country [200+], Page per Discipline [400-500], Page per venue, Page per team
A lot of output…
• Almost real time statistics and live event pages
• Time coded, metadata annotated, on demand video, 58,000 hours of content
• Far too many web pages for far too few journalists
• DSP annotation architecture to automate content aggregation
Store
1. Atomic content assets stored in MarkLogic XML store
2. XML content queryable via XQuery
3. Content assets searchable
4. Sports statistics searchable/queryable via XQuery
5. Ontological SPARQL via BigOWLIM, assets XQuery via MarkLogic
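One way to picture how the two stores cooperate is a two-step lookup: ask the triple store which assets are annotated with a concept, then fetch those assets from the XML store. The sketch below uses in-memory stand-ins only. The `about` predicate, the asset identifiers, and the sample XML are hypothetical, and a real deployment would issue SPARQL to BigOWLIM and XQuery to MarkLogic instead.

```python
import xml.etree.ElementTree as ET

# Stand-in for the triple store: (asset, predicate, concept) annotations.
ANNOTATIONS = {
    ("asset-1", "about", "England"),
    ("asset-2", "about", "Brazil"),
    ("asset-3", "about", "England"),
}

# Stand-in for the XML content store, keyed by asset id.
ASSETS = {
    "asset-1": "<story><headline>Squad announced</headline></story>",
    "asset-2": "<story><headline>Group draw</headline></story>",
    "asset-3": "<story><headline>Match report</headline></story>",
}


def assets_about(concept):
    """Step 1 (SPARQL in practice): find asset ids annotated with a concept."""
    return sorted(s for (s, p, o) in ANNOTATIONS if p == "about" and o == concept)


def headlines_for(concept):
    """Step 2 (XQuery in practice): fetch each asset and read its headline."""
    result = []
    for asset_id in assets_about(concept):
        root = ET.fromstring(ASSETS[asset_id])
        result.append(root.findtext("headline"))
    return result


print(headlines_for("England"))
```

Keeping the annotations separate from the assets means a new aggregation page only needs a new concept to query by, not a new hand-built index.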
Accessible API
GET https://api.live.bbc.co.uk/sportsdata/statsapi/football/table/ais/competition/118996114
Accept: application/json
GET https://api.live.bbc.co.uk/sportsdata/statsapi/football/videprinter
GET https://api.int.bbc.co.uk/sportsdata/statsapi/formula1/year/2012/calendar
Accept: application/json
etc.
story data on news index XML
<item>
  <xi:include href="http://www.bbc.co.uk/asset/13447877"
      xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)">
    <xi:fallback>
      <!-- Unable to find href="http://www.bbc.co.uk/asset/13447877" xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)" -->
    </xi:fallback>
  </xi:include>
  ...
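The include-with-fallback pattern in the index XML can be sketched as a small resolver: each `xi:include` is replaced by whatever a loader returns for its `href` and `xpointer`, and if the loader fails the children of `xi:fallback` are used instead. This is an illustrative sketch, not the BBC's resolver. The `<index>` wrapper, the `<missing/>` marker, and both loader functions are hypothetical, and the XPointer expression is passed through to the loader unevaluated.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

DOC = """
<index xmlns:xi="http://www.w3.org/2001/XInclude">
  <item>
    <xi:include href="http://www.bbc.co.uk/asset/13447877"
        xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)">
      <xi:fallback><missing/></xi:fallback>
    </xi:include>
  </item>
</index>
"""


def resolve_includes(parent, loader):
    """Replace xi:include children with loaded content, or with their fallback."""
    resolved = []
    for child in list(parent):
        if child.tag != XI + "include":
            resolve_includes(child, loader)
            resolved.append(child)
            continue
        try:
            resolved.append(loader(child.get("href"), child.get("xpointer")))
        except LookupError:
            fallback = child.find(XI + "fallback")
            if fallback is None:
                raise  # no fallback supplied, so the failure is fatal
            resolved.extend(list(fallback))
    parent[:] = resolved


def failing_loader(href, xpointer):
    # Hypothetical loader simulating an asset that cannot be found.
    raise LookupError(href)


def stub_loader(href, xpointer):
    # Hypothetical loader returning a placeholder for the selected fragment.
    meta = ET.Element("itemMeta")
    meta.set("source", href)
    return meta


root = ET.fromstring(DOC)
resolve_includes(root, failing_loader)
print([el.tag for el in root.find("item")])
```

Swapping `failing_loader` for `stub_loader` shows the success path, where the item ends up containing the included `itemMeta` element.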
data on news index XML
HTTP GET https://api.live.bbc.co.uk/content/asset/news/technology/
HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/json
Or
HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/xml
Contextualised output
• Audience
• Platform
• Response type
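The three context dimensions map directly onto request headers. As a minimal sketch, the helper below builds (but does not send) such a request with Python's standard `urllib.request`; the function name `contextual_request` is hypothetical, while the URL and header values are the ones shown on the slide.

```python
from urllib.request import Request


def contextual_request(url, audience, platform, response_type):
    """Build, without sending, a GET request carrying the three context dimensions."""
    return Request(
        url,
        headers={
            "X-Candy-Audience": audience,
            "X-Candy-Platform": platform,
            "Accept": response_type,
        },
        method="GET",
    )


URL = "https://api.live.bbc.co.uk/content/asset/news/technology/"
as_json = contextual_request(URL, "Domestic", "EnhancedMobile", "application/json")
as_xml = contextual_request(URL, "Domestic", "EnhancedMobile", "application/xml")

# Same resource and context, two response types.
print(as_json.get_header("Accept"), as_xml.get_header("Accept"))
```

Note that `urllib` stores header names in capitalized form (for example `X-candy-audience`); HTTP header names are case-insensitive, so this does not change the request's meaning.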
data on news index XML
HTTP GET https://api.live.bbc.co.uk/content/asset/news/uk-17829360
HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/json
Or
HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/xml
Platform future…
use fully dynamic approach (News Mobile style)
BBC News high web site re-engineered to use fully dynamic approach (News Mobile style)
Real-time Olympics 2012 stats and video overlay
Upgrade to MarkLogic 5
MarkLogic XA Transactions (removing handcrafted XQuery for master/master replication)
MarkLogic Binary storage R&D
Etc.