The Guardian's Content API

philwills
October 28, 2011

A largely non-technical review of why we built the Content API/Open Platform and how doing so has changed the way we work and engage with partners.

Transcript

  1. • The Guardian • What is the Content API? • What is the Open Platform? • How they have changed us • What’s next? Friday, 28 October 2011
  2. The Guardian • History • Scott Trust • Open as a business strategy
  3. To secure the financial and editorial independence of the Guardian in perpetuity. To promote freedom in the press and liberal journalism globally.
  4. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
  5. • Almost all our content ≈ 1.4 million pieces • Most freely available • HTTP • JSON or XML • RESTish
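
To make slide 5's "HTTP, JSON or XML, RESTish" concrete, here is a minimal sketch of building a request URL and reading a JSON response. The slides give no endpoint details, so the host `content.guardianapis.com`, the `search` path, the `q`, `format` and `api-key` parameter names, and the shape of the sample response body are all assumptions made for illustration. No network call is made.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; the slides do not state one.
BASE_URL = "https://content.guardianapis.com"


def build_search_url(query, fmt="json", api_key=None):
    """Build a search URL. fmt is 'json' or 'xml', the two formats on the slide."""
    if fmt not in ("json", "xml"):
        raise ValueError("format must be 'json' or 'xml'")
    # Parameter names here are assumptions, not taken from the slides.
    params = {"q": query, "format": fmt}
    if api_key is not None:
        params["api-key"] = api_key
    return f"{BASE_URL}/search?{urlencode(params)}"


# Made-up response body, used only to show the parsing step.
SAMPLE_BODY = (
    '{"response": {"status": "ok", "results": ['
    '{"webTitle": "Example headline", '
    '"webUrl": "https://www.theguardian.com/example"}]}}'
)


def headlines(body):
    """Return the headline of each result in a JSON response body."""
    payload = json.loads(body)["response"]
    return [item["webTitle"] for item in payload["results"]]
```

Calling `build_search_url("open platform", fmt="xml")` would request the same resource in the other format, which is the practical meaning of "JSON or XML" on the slide.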
  6. Main DB • Solr Master • Solr Replica • Solr Replica • Solr Replica • Lift webapp • Lift webapp • Lift webapp
  7. Access levels • Keyless: Take our headlines. You keep associated revenues. • Approved: Take our full article content, but with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue. • Partner: Take, reformat, augment our content. Revenue model to be negotiated. Combination of Media, Fees, Downloads. • Internal
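
The tiers on slide 7 can be sketched as a simple filter over an article. This is a hypothetical model, not the Guardian's implementation: the field names (`headline`, `url`, `body`, `embedded_advert`) are invented, and treating Partner and Internal as "full content without a forced advert" is an assumption, since the slide says nothing about what Internal grants.

```python
from enum import Enum


class Tier(Enum):
    KEYLESS = "keyless"
    APPROVED = "approved"
    PARTNER = "partner"
    INTERNAL = "internal"


# Hypothetical field names: what a keyless caller may see ("take our headlines").
HEADLINE_FIELDS = {"headline", "url"}


def visible_fields(article, tier):
    """Return the view of an article that a given tier would receive."""
    if tier is Tier.KEYLESS:
        # Headlines only; the caller keeps associated revenues.
        return {k: v for k, v in article.items() if k in HEADLINE_FIELDS}
    view = dict(article)
    if tier is Tier.APPROVED:
        # Full content, but delivered with an advert attached.
        view["embedded_advert"] = True
    # Assumption: PARTNER and INTERNAL get full content with no forced advert.
    return view
```

For example, `visible_fields({"headline": "H", "url": "u", "body": "B"}, Tier.KEYLESS)` drops the `body` field, while the same call with `Tier.APPROVED` keeps it and adds the advert flag.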
  8. It’s simple, isn’t it? • JSON/XML over HTTP • Simplest solution we could envisage • Still overwhelming for some of our partners
  9. • Didn’t try and be Facebook • HTTP • HTML • Tools for existing CMS
  10. Highlighted existing issues • ZZZ (do not use) • Rights management not fully integrated • Forced cross-discipline work
  11. A focus for modelling • “How will it look in the API?” • Explicit, shared view of the model
  12. Beyond wireframes • Not just Lorem Ipsum • Enables quick prototyping with real data by anyone with basic technical skills
  13. Context Boundaries • CMS was monolithic • Break out concerns beyond the core • Core CMS now shrinking
  14. Development process • Loosely coupled systems enable loosely coupled teams • Frequent, independent releases without incidental complexity • Greater experimentation • A bit more documentation
  15. Technical Influence • JSON in API - good representation of content model • Relational schema - not so good ⇒ Document oriented datastore
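
Slide 15's contrast between the JSON representation and a relational schema can be illustrated with a toy example. Everything here is hypothetical (ids, field names, table layout); it only shows why a nested document matches the API's shape directly, while normalised rows have to be joined back together first.

```python
# Hypothetical data throughout; not the Guardian's real content model.

# One article as a single nested document, the shape an API client sees.
article_doc = {
    "id": "a1",
    "headline": "Example headline",
    "tags": [{"id": "t1", "title": "APIs"}],
}

# The same article spread over normalised relational-style tables.
articles = [("a1", "Example headline")]   # (article_id, headline)
tags = [("t1", "APIs")]                   # (tag_id, title)
article_tags = [("a1", "t1")]             # (article_id, tag_id) join table


def assemble(article_id):
    """Join the three tables back into the nested document shape."""
    headline = next(h for (aid, h) in articles if aid == article_id)
    tag_titles = dict(tags)
    linked = [tid for (aid, tid) in article_tags if aid == article_id]
    return {
        "id": article_id,
        "headline": headline,
        "tags": [{"id": tid, "title": tag_titles[tid]} for tid in linked],
    }
```

A document-oriented store would hold `article_doc` as it is, so the join step in `assemble` is not needed on every read.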
  16. Technical Influence • API showed natural point for automated integration testing • Now have a number of non-public APIs of a similar form
  17. External Dev Engagement • Not just answering questions on a mailing list • Many devs wanted to build Guardian products • Making best use requires whole business • Business processes not always agile enough
  18. Richer Content • Mostly text at present • Yet more complicated rights to manage
  19. Content in • Get our own content not in the core CMS • Allow partners to use our infrastructure
  20. Internal tools • More accessible reporting • Improved stats • Finer-grained access control
  21. • Bigger change than we anticipated • As much change internally as externally • People seeking us out who wouldn’t otherwise • Transformed our view of our own content