API theory vs practice, an honest review of ARTE’s API strategy

When designing an API, we often believe we know precisely what we want, and that our first release will be a success. In reality, it turns out that API design is more of an iterative process. As the Project Manager in charge of APIs at ARTE, I will give you my feedback on the challenges we have faced and the choices we have made, from the launch of our first API in 2012 to our 3rd generation in 2018.

Matthieu BREEN

December 11, 2018

Transcript

  1. None
  2. None
  3. 12K+ titles in French and German 1.5K+ titles with subtitles

    in: •  English •  Spanish •  Polish •  Italian 1000+ hours of recorded live concerts Free to use / No subscription
  4. None
  5. None
  6. < 2012

  7. Online stuff was handled by a single CMS (monolith) – 

    WYSIWYG made editors happy but : •  No design coherence across our catalog •  Content was randomly structured –  Worked well for the web, not for Apps (Mobile / TV) –  Had to cope with a huge archive ! •  Website was getting slower as size of the catalog increased –  Costly to maintain / Nearly impossible to upgrade •  Got a 500K€ quote for an upgrade
  8. CMS (Poorly structured content) Data Interface 1 Mobile APP Website

    Data Interface 3 New APP Data Interface 3 Editors For every new App, we would develop a dedicated data interface In consequence •  New App development was slow •  Could not develop Apps for each platform •  Apps only provided basic functionalities •  Apps were rarely updated
  9. None
  10. CMS (Poorly structured content) CMS (Poorly structured content) Editors

  11. REST API / JSON Mobile APP Website New APP APIOS

    (BROADCAST TOOL – Structured data) SINGLE DATA INTERFACE Editors Every App should share the same data interface
  12. Objectives –  Accelerate front-end development of our own Apps /

    websites –  Make Apps with rich features & good performance How –  Provide developers with the data they need •  By exposing a REST API with structured data in JSON –  Use this API for every new APP development –  Reduce the number of errors due to misunderstanding •  By exposing a simplified version of our data model •  By providing exhaustive documentation with examples
  13. By the end of 2012 our first API “papi“ was

    born (PAPI was developed in Java and used a NoSQL database) What worked ? •  Developers were thrilled to have access to an API •  The structured content could be used on all devices •  Sped up App development by nearly 2x •  We could be present on more platforms (Android / Windows…) •  In fairness, API v1 worked so well we wanted to extend it for external use (partnerships)
  14. Limitations •  Documentation stored in our internal wiki •  Hard

    to keep the documentation up to date •  Hard to give access to the documentation •  Catalog restrictions •  API v1 only featured primary content •  Catalog could only be updated a few times a day (really slow cronjob updates) •  One-API-for-all strategy •  Our single API started to get huge, and was hard to maintain •  Our search engine was not great •  Every App had to implement its own search engine using local storage •  Did not support HTTPS •  Our strategy to reduce the size of JSON files was a failure
  15. API v1: strategy to reduce JSON file size by using 3-letter codes as attribute names

    {
      "VID": "082245-000",  // Id of the video
      "VTI": "Coal Wars",  // Title of the video
      "VDE": "The costs of moving on from coal",  // Video description
      "VDU": 3120,  // Duration of the video in seconds
      "VSO": "ARTE+7"  // Source of the video
      …
    }

    1. The gains provided by this strategy are negligible after Gzip compression
    2. This naming style is catastrophic regarding attribute comprehension
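The claim that gzip erases most of the gain from short attribute names can be checked with a quick measurement. The sketch below is not ARTE's code: it builds a synthetic catalog twice, once with 3-letter codes and once with explicit names such as `title` and `description`, and compares payload sizes before and after gzip. The record contents, the catalog size of 500 and the helper names `build_catalog` and `sizes` are assumptions made for illustration.

```python
import gzip
import json


def build_catalog(keys, n=500):
    """Build n synthetic video records using the given attribute names."""
    vid, title, desc, duration, source = keys
    return [
        {
            vid: f"{i:06d}-000",
            title: f"Title {i}",
            desc: f"Description of video {i}",
            duration: 60 * (i % 90),
            source: "ARTE+7",
        }
        for i in range(n)
    ]


def sizes(records):
    """Return (raw_bytes, gzipped_bytes) for the JSON encoding of records."""
    raw = json.dumps(records).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical catalogs: same data, different attribute naming styles.
short_raw, short_gz = sizes(build_catalog(("VID", "VTI", "VDE", "VDU", "VSO")))
long_raw, long_gz = sizes(
    build_catalog(("id", "title", "description", "duration", "source"))
)

print("raw bytes saved by short names: ", long_raw - short_raw)
print("gzip bytes saved by short names:", long_gz - short_gz)
```

Repeated attribute names are exactly the kind of redundancy a dictionary-based compressor removes, so the second figure should come out far smaller than the first.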
  16. Better approach

    {
      "id": "082245-000",  // Id of the video
      "type": "video",  // Type of the video
      "title": "Coal Wars",  // Title of the video
      "description": "The costs of moving on from coal",  // Description
      "duration": 3120,  // Duration of the video in seconds
      "source": "ARTE+7"  // Source of the video
      …
    }

    Choosing explicit names for your attributes is the best form of documentation
  17. None
  18. Split our big API Website / APP PAPI (v1)

  19. Split our big API into multiple smaller APIs easier to

    manage: Data API/ oAuth API / Player API… Website / APP API1 API2 APIn
  20. Each API has to manage: 1)  TLS (https) 2)  Authorization

    (Access roles) 3)  Throttling (Rate-limit) 4)  Cache Website / APP API1 API2 APIn TLS Auth Throttling Cache TLS Auth Throttling Cache TLS Auth Throttling Cache
  21. API Proxy Shared for all ARTE APIs Create an API

    proxy to enable the sharing of: 1)  TLS (https) 2)  Authorization (Access roles) 3)  Throttling (Rate-limit) 4)  Cache Therefore API developers can concentrate on core development Website / APP API1 API2 APIn TLS Auth Throttling Cache
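As a rough illustration of what "shared for all APIs" means, here is a minimal in-memory sketch of a proxy that applies authorization, throttling and caching once, in front of any number of back-end APIs. It is not ARTE's proxy: the `ApiProxy` class, its parameters, the fixed 60-second throttling window and the status strings are all assumptions, and TLS termination is left out because it happens at the network layer.

```python
import time
from typing import Callable, Dict, Set, Tuple


class ApiProxy:
    """Hypothetical shared proxy: auth, throttling and cache in one place."""

    def __init__(
        self,
        backends: Dict[str, Callable[[str], str]],
        roles: Dict[str, Set[str]],
        rate_limit: int,
        cache_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.backends = backends      # API name -> function(path) -> body
        self.roles = roles            # access token -> set of allowed API names
        self.rate_limit = rate_limit  # max requests per token per 60 s window
        self.cache_ttl = cache_ttl    # seconds a cached response stays valid
        self.clock = clock
        self.windows: Dict[str, Tuple[float, int]] = {}
        self.cache: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def handle(self, token: str, api: str, path: str) -> Tuple[int, str]:
        # 1) Authorization: the token must exist and be allowed on this API.
        allowed = self.roles.get(token)
        if allowed is None:
            return 401, "unknown token"
        if api not in allowed:
            return 403, "forbidden"

        # 2) Throttling: fixed 60-second window per token.
        now = self.clock()
        window_start, count = self.windows.get(token, (now, 0))
        if now - window_start >= 60:
            window_start, count = now, 0
        if count >= self.rate_limit:
            return 429, "rate limit exceeded"
        self.windows[token] = (window_start, count + 1)

        # 3) Cache: serve a fresh cached body, otherwise call the back end.
        key = (api, path)
        hit = self.cache.get(key)
        if hit is not None and now - hit[0] < self.cache_ttl:
            return 200, hit[1]
        body = self.backends[api](path)
        self.cache[key] = (now, body)
        return 200, body
```

The back-end functions stay free of any cross-cutting logic, which is the point made on the slide: API developers only write the core behaviour.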
  22. API1 Build a developer portal Every API must expose its

    documentation to the dev portal (OpenAPI format) Developers need to be able to: •  Create an account / validate terms of use •  Access documentation •  Test requests in a sandbox •  Fetch an access token ARTE needs: •  A user administration console •  To be able to configure the access roles and throttling of each APP API2 APIn Dev Developer portal Doc Doc Doc
  23. Objectives •  Improve API Catalog •  ARTE’s complete video catalog

    should be accessible and searchable •  API catalog should be updated in near real time (RabbitMQ based updates) •  Make it easier for developers •  Give developers access to a developer portal / sandbox •  Only allow access to required resources based on user roles •  Use an API standard: JSON API •  Supporting API versioning •  Industrialize the creation of APIs •  Create multiple small APIs, each focused on a given purpose •  Auto-generate documentation based on code annotations (swagger / OpenAPI) •  Be able to monitor user traffic by sending logs to a dedicated stack (ELK) •  Improve API Security (HTTPS, oAuth…) •  Find new API usage : Hacks, POCs, R&D and partnerships
  24. By the end of 2014 our second generation of APIs

    was born : •  “opa“: the main data API (PHP Symfony / MongoDB) •  “api-player” (PHP Symfony / no storage) •  “oAuth”: oAuth + hosting of dev portal (PHP Symfony / Sonata Admin / MongoDB)
  25. None
  26. None
  27. None
  28. None
  29. What worked ? •  Improved catalog •  Developer portal helped

    us to industrialise API use •  Auto generated documentation really made APIs easy to maintain •  APIs could now be used for Hacks / POCs / R&D •  We introduced a dedicated image resizing server to improve App performance (Thumbor) •  Monitoring logs with ELK
  30. None
  31. Limitations •  JSON API “include” system for linked resources • 

    Not well understood by all developers •  Adapt content based on user roles •  Hard to maintain, poor performance •  Main issue : different content would be displayed on different APPs •  Web vs Mobile vs TV (API requests were not exactly the same) What was never used •  API v2 supports versioning of minor versions The power of the almighty sandbox •  When you give developers access to an API portal : •  Developers “never” ask questions •  You no longer fully control the use of your API Discovered that APIs are not for all partnerships
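For readers unfamiliar with the JSON API linking mechanism listed under Limitations on this slide: in a JSON:API compound document, the primary resource only carries identifiers (`type` and `id`) for its related resources, and the full related resources arrive in a separate top-level `included` array. The client has to join the two, which is the step that often confuses developers. The sketch below shows that join; the `resolve_included` helper and the sample `collections` resource are hypothetical, not taken from ARTE's API.

```python
from typing import Any, Dict


def resolve_included(document: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a JSON:API compound document into one plain dictionary."""
    # Index every included resource by its (type, id) identifier.
    index = {
        (res["type"], res["id"]): res for res in document.get("included", [])
    }

    def lookup(linkage: Dict[str, str]) -> Dict[str, Any]:
        target = index.get((linkage["type"], linkage["id"]))
        # Fall back to the bare identifier when the resource was not included.
        return target.get("attributes", {}) if target else linkage

    primary = document["data"]
    flat = {"id": primary["id"], **primary.get("attributes", {})}
    for name, relationship in primary.get("relationships", {}).items():
        linkage = relationship.get("data")
        if isinstance(linkage, list):      # to-many relationship
            flat[name] = [lookup(item) for item in linkage]
        elif linkage is not None:          # to-one relationship
            flat[name] = lookup(linkage)
    return flat


# Hypothetical compound document: a video linked to one collection.
EXAMPLE = {
    "data": {
        "type": "videos",
        "id": "082245-000",
        "attributes": {"title": "Coal Wars"},
        "relationships": {
            "collection": {"data": {"type": "collections", "id": "c1"}}
        },
    },
    "included": [
        {"type": "collections", "id": "c1", "attributes": {"name": "Documentaries"}}
    ],
}

print(resolve_included(EXAMPLE))
```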
  32. Partners Tech knowledge Partner’s interest in the exchange LOW HIGH

    LOW HIGH PARTNERS API YOUR API XML & RSS FEEDS Choosing the right DATA exchange tool
  33. None
  34. To display a single page, an App is required to

    make multiple API calls ARTE App API1 API2 APIn 1 2 3 … End user
  35. –  Guarantee that the same content is displayed on each

    device / APP –  The middleware can evolve quickly with the product, whereas the APIs’ data model stays roughly the same With the use of a middleware, all required data to display a page can be fetched in a single call ARTE App API1 API2 APIn 1 2 3 … End user API Middleware ( Node.js / GraphQL ) Create an API Middleware to implement the 1 page 1 call philosophy
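The "1 page 1 call" idea can be sketched as a fan-out: the app calls the middleware once, and the middleware calls each back-end API and merges the results into a single payload. ARTE's middleware is built with Node.js and GraphQL; the Python version below only illustrates the pattern, and the `build_page` function and the fetcher names in the usage note are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def build_page(fetchers: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Call every back-end fetcher in parallel and merge the results.

    `fetchers` maps a section name of the page to a zero-argument function
    that calls one back-end API and returns that section's data.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        futures = {name: pool.submit(fetch) for name, fetch in fetchers.items()}
        # result() re-raises any back-end error, so a failure is not hidden.
        return {name: future.result() for name, future in futures.items()}
```

A hypothetical home page could then be assembled with `build_page({"most_viewed": fetch_most_viewed, "most_recent": fetch_most_recent})`. Every app receives the same merged payload, which is how the middleware guarantees identical content across devices.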
  36. Objectives –  1 page 1 call philosophy –  Functional driven

    endpoints, such as: •  Most viewed videos •  Most recent videos •  …
  37. API v3 was born late 2017 What worked ? • 

    1 page 1 call •  Reduces possibility for error •  Improved speed •  We would now display the same content on all Apps (web vs mobile vs TV) What did not ? •  API usage is still mostly limited to Internal use
  38. 60% Direct Internal use www.arte.tv ; Club ; mobile Apps

    (iOS / Android / Windows); TV Apps (HBBTV, set-top-box); Corporate website ; Journalist website… 10% Direct external use Molotov ; Plurimedia ; Roku ; ARD… 30% Partner Feeds (XML / RSS) Data Feed generated for CanalSat; YouTube ; Free, Orange; DT ; TDF ; Apple universal Search; Tvspielfilm… 5% Marketing & SEO Marketing tools & SEO : Sitemaps ; Outbrain ; Keywee… 5% Hacks & POC Hackathons ; Alexa ; Spotify ; Deezer; Fraunhofer; Artefact…
  39. Hard to keep the data model in sync when using

    a NoSQL database •  Probably switch to an SQL database for primary data storage (in combination with Elasticsearch for document storage) Improve our cache hit rate •  Switch to cache preloading (Varnish) and tag-based purge Partners using deprecated versions of the API: •  If you don’t deprecate, you will lose velocity over time •  If an API were a job contract, it would be a temporary one •  Hard to remove old API support if it still generates a lot of video views.
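Tag-based purge means each cached response is labelled with the identifiers of the content it depends on, so that when one piece of content changes, only the responses carrying its tag are invalidated instead of the whole cache. The in-memory sketch below shows the idea only. It is not Varnish configuration, and the `TaggedCache` class, its methods and the tag format are assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Optional, Set


class TaggedCache:
    """Hypothetical cache whose entries can be purged by tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, Any] = {}
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, key: str, value: Any, tags: Iterable[str]) -> None:
        """Store a response and remember which tags it depends on."""
        self._entries[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Optional[Any]:
        return self._entries.get(key)

    def purge_tag(self, tag: str) -> int:
        """Drop every entry carrying `tag`; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._entries:
                del self._entries[key]
                removed += 1
        return removed
```

For example, a home page response tagged with the ids of every video it lists is dropped as soon as one of those videos is updated, while unrelated pages stay cached, which is what helps the hit rate.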
  40. API design is more of an iterative process. •  For

    a first version concentrate on your MVP and start from there •  Admit your failures early, own your mistakes and grow from them •  Choose extending over versioning •  Try and build functional driven endpoints API limitations •  APIs are not for everyone: learn when to use them. •  Keep in mind that an API is a temporary contract To make great apps, you need structured data Importance of good documentation : •  Choosing explicit names for your attributes is the best form of documentation •  Try and auto generate the documentation based on code annotations •  Be aware of the power of the almighty sandbox Consider using an API middleware close to your front-end needs
  41. None
  42. None
  43. API Proxy 15K rpm (average) 40K rpm (peak) Average client

    response time: 10ms Cache rate ≈ 70% Hosting 12 production VMs 9 preprod / dev VMs APIs 5K rpm (average) 14K rpm (peak) Average back-end time: 120 ms Release frequency : 1.5 release/week