Creating Scalable APIs & API Performance

In his talk “Scaling APIs (& API Performance)”, Patrick will give an overview of the challenges of building a well-performing API: which components play a part, and how these components can be optimised and scaled properly. He’ll walk through how to identify low-performing components and show some tools that make life as an API architect easier. He’ll try to differentiate what a modern API should do from what it should not, presenting new patterns and systems to optimise performance and, ultimately, the user experience across consumer devices. The talk will be technically oriented and give some insights into the daily routines and problems one can run into when designing an API for consumers. Given the different technologies, languages and frameworks an API can operate with, Patrick will try to be platform agnostic and won’t go into too much detail on the various platforms themselves.

As a Software Architect and CEO at blended.io, Patrick designs and crafts scalable, high-performance, mobile-friendly APIs for various startups and companies. These APIs respond within 25ms under a load of millions of requests. He has scaled systems across data centers around the world to guarantee an optimal user experience for API consumers.

PatrickHeneise

May 06, 2015

Transcript

  1. “I’ve never seen a startup die because it couldn’t scale fast enough. I’ve seen hundreds of startups die because people refused to embrace their product.” –Guy Kawasaki
  2. Patrick Heneise
    Software Architect at blended technologies S.L.
    MSc in Media Technology
    BSc in Computer Science in Media
    Startup Mentor
    Co-Organiser MediterráneaJS, BarcelonaJS, NodeBCN & CoreOS Barcelona
  3. Application Programming Interface: “A set of routines, protocols, and tools for building software applications.” –Wikipedia
  4. APIs
    • could send a lot of emails
    • could send a lot of other notifications
    • could do background jobs
    • could do heavy computation
    • don’t render web content (HTML)
  5. Unit tests
    Core Billing #subscribe
    ✓ fail to upgrade subscription if no subscription exists
    ✓ create inactive subscription in database
    ✓ modify inactive subscription in database
    ✓ fail to subscribe with an invalid period
    ✓ fail to subscribe to a non-existing plan
    1) create inactive subscription in database
    2) modify inactive subscription in database
    ✓ create active subscription in database and stripe (1358ms)
    ✓ fail to subscribe if active subscription already exists
  6. End-to-End (E2E) Tests
    Billing POST /billing/subscribe
    ✓ should fail to subscribe to a non existing plan
    ✓ should pre-subscribe user to basic yearly plan
  7. Data transfer, optimise in- and output
    • Use JSON
    • Minimise transfer data ({ dont_use_excessive_attribute_names: true; })
    • Smart endpoints instead of REST-only endpoints (give users the data they need in one call)
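To illustrate the point about attribute names, here is a minimal sketch that compares the serialised size of the same record with long and short keys. The record and all of its field names are hypothetical; only the standard `json` module is used.

```python
import json

# Hypothetical record with long, descriptive attribute names.
verbose = {
    "user_identifier": 42,
    "user_display_name": "Ada",
    "number_of_unread_notifications": 3,
}

# The same data with short attribute names (also hypothetical).
compact = {"id": 42, "name": "Ada", "unread": 3}


def payload_size(obj) -> int:
    """Size in bytes of the UTF-8 encoded JSON, serialised without extra whitespace."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


verbose_size = payload_size(verbose)
compact_size = payload_size(compact)
print(verbose_size, compact_size)
```

The saving per record is small, but it is paid on every item of every response, so it adds up for list endpoints. Response compression narrows the gap, so it is worth measuring both before renaming fields.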
  8. Data storage, use the right database for the job
    • MySQL for session data?
    • NoSQL for relational data?
  9. Node distance
    • use nodes within the same subnet
    • use 'Private Networking' / internal network interfaces
    • try to have the nodes physically close to each other. The best database doesn't help if it's in China and your API server in Barcelona.
  10. Async it!
    • Does the user really need to know (and wait for it) if that email has been sent?
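A minimal sketch of the idea, assuming an in-process queue and a single worker thread. `send_email` and `handle_signup` are hypothetical names, and the `time.sleep` call stands in for a slow third-party mail service; a production setup would more likely use a persistent queue.

```python
import queue
import threading
import time

email_queue = queue.Queue()
sent = []  # records delivered addresses so the effect is observable


def send_email(address: str) -> None:
    """Stand-in for a slow third-party mail call (hypothetical)."""
    time.sleep(0.05)
    sent.append(address)


def worker() -> None:
    """Deliver queued emails one by one, outside the request path."""
    while True:
        address = email_queue.get()
        try:
            send_email(address)
        finally:
            email_queue.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_signup(address: str) -> dict:
    """Enqueue the email and return without waiting for delivery."""
    email_queue.put(address)
    return {"status": "ok"}
```

The handler returns as soon as the address is queued, so the response time no longer depends on the mail provider.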
  11. Request and response optimisation
    • Provide consumers with the data they need (to reduce requests)
    Example Twitter: Timeline + user info, no need to get every user's details separately.
    Example Instagram: Notify consumers when there is new content instead of letting them poll every X seconds (and generate requests)
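The timeline example could be sketched as follows. The data and the `get_timeline` function are hypothetical and kept in memory; the point is only that one response carries both the posts and the authors they reference.

```python
# Hypothetical in-memory data standing in for a database.
USERS = {1: {"id": 1, "name": "ada"}, 2: {"id": 2, "name": "bob"}}
POSTS = [
    {"id": 10, "user_id": 1, "text": "hello"},
    {"id": 11, "user_id": 2, "text": "hi"},
    {"id": 12, "user_id": 1, "text": "again"},
]


def get_timeline() -> dict:
    """Return the posts plus every author they reference in one response."""
    author_ids = {post["user_id"] for post in POSTS}
    return {
        "posts": POSTS,
        "users": {uid: USERS[uid] for uid in sorted(author_ids)},
    }


timeline = get_timeline()
```

With separate endpoints, a consumer would need one request for the timeline plus one per distinct author (three requests here); the combined response needs one.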
  12. Data optimisation
    • Implement caching layers where possible and feasible
    • Cache when it makes sense!
    Example: Doing complex geo-location searches over a big ElasticSearch index? Cache the result in Redis.
  13. Remember all that data you measured?
    • E-Mail, notifications and other things that involve 3rd party services
    • background jobs
    • computation jobs
    • cron jobs
    • anything else that takes more than 25ms to respond
  14. Microservices
    • Anything that doesn't require immediate response to the user can be done asynchronously by a micro-service.
    • Example: Using a 3rd party service can add precious time to a response. Instead of waiting, respond with HTTP 102 or 202 (request accepted, processing pending).
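One way to sketch the "accepted, processing pending" response is shown below. It uses 202 Accepted, since 102 Processing is an interim informational status rather than a final response. The job store, `submit`, `job_status` and `run_pending_jobs` are all hypothetical; in a real system the last function would be a separate service reading from a queue.

```python
import uuid
from http import HTTPStatus

jobs = {}  # job_id -> {"status": ..., "payload": ...}; in-memory stand-in


def submit(payload: dict):
    """Register the work and answer immediately with 202 Accepted."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "payload": payload}
    return HTTPStatus.ACCEPTED, {"job_id": job_id}


def job_status(job_id: str):
    """Let the consumer check on the job later."""
    job = jobs.get(job_id)
    if job is None:
        return HTTPStatus.NOT_FOUND, {"error": "unknown job"}
    return HTTPStatus.OK, {"status": job["status"]}


def run_pending_jobs() -> None:
    """Stand-in for the microservice that performs the slow work."""
    for job in jobs.values():
        if job["status"] == "pending":
            # The slow third-party call would happen here.
            job["status"] = "done"
```

The consumer gets a job identifier straight away and can either poll `job_status` or be notified when the work completes.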
  15. The larger the components, the harder they are to replace. It's easier to change a tyre than the motor.
  16. Software
    • Database
        • Clustering
        • Sharding
        • find the right database
    • Web Server
        • find the optimal web server or proxy (nginx, Apache, haproxy, ...)
    • API platform
        • find the optimal platform (node.js, Python, Ruby on Rails, ...)
    • Assets
        • Amazon S3
        • CDN
  17. “Your users around the world don't care that you wrote your own DB” –Mike Krieger, Co-Founder @ Instagram
  18. Vertical Hardware Scaling
    • Add more memory
    • Add more computation power
    • Reboot
    • Double capacity != double scale / speed
  19. Horizontal Hardware Scaling
    • Add Database nodes
    • Add API platform nodes
    • Add Computation nodes
    • Add node locations (cross-datacenter, distance to the consumer)
    Building a Barcelona startup with a cluster in US-WEST and wondering why your API response time is high?
  20. Horizontal vs. Vertical Hardware Scaling
    + Cheaper
    + Faster to implement
    + Adds node redundancy & security
    - Requires extra nodes for load balancing
  21. Steps
    1. Measure. Find the bottleneck
    2. Can performance be improved with a software fix?
    3. Can you exchange the component?
    4. Can you scale vertically?
    5. Scale horizontally.
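The first step, measuring, could be sketched as below. `measure` and `slowest_components` are hypothetical helpers built on the standard library, the component names are invented, and the 25ms budget is taken from the figure used earlier in the slides.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # component name -> list of durations in ms


@contextmanager
def measure(component: str):
    """Record how long the wrapped block takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[component].append((time.perf_counter() - start) * 1000.0)


def slowest_components(budget_ms: float = 25.0):
    """Return (name, average ms) pairs over budget, slowest first."""
    averages = {name: sum(v) / len(v) for name, v in timings.items()}
    over = [(name, avg) for name, avg in averages.items() if avg > budget_ms]
    return sorted(over, key=lambda item: item[1], reverse=True)


# Usage: wrap each component of a request, then inspect the report.
with measure("database"):
    time.sleep(0.05)  # stand-in for a slow query
with measure("serialise"):
    pass  # stand-in for a fast step

report = slowest_components()
```

Only the components that appear in the report are candidates for the remaining steps; everything under budget can be left alone.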