How to Build a Scalable API

Tips and tricks for building scalable APIs.

Talk from API Strategy Conference - February 2013

Travis Reeder

February 21, 2013

Transcript

  1. Who am I?

    Travis Reeder, CTO and Co-Founder of Iron.io. Iron.io provides scalable and elastic cloud infrastructure services: IronMQ, IronWorker and IronCache. Building things to scale is our business.
  2. Iron.io APIs

    • 100M+ API requests per day
    • 300K+ jobs executed per day on IronWorker
    • And growing...
    • 100% uptime past 30 days
    • 99.98% uptime past 6 months
    As of Feb. 12
  3. What is Scalability?

    "Scalability is the ability of a system, network, or process, to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added." Source: Wikipedia
  4. In other words...

    You must be able to grow by throwing more hardware at it.
  5. Bonus: Increased reliability

    Build to scale == more reliable.
    • Redundancy
    • Easy to provision new resources
    • Easier transition to HA (high availability)
  6. Choose the right infrastructure

    • Use the cloud
    • A cloud that has the ability to truly launch servers on demand and will let you launch a lot of them
      ◦ AWS comes to mind
  7. Choose the right load balancer

    • Using Amazon? Use ELB.
    • Using Rackspace? Use Rackspace Load Balancer.
    • Using something else? Put a good load balancer like nginx on a few boxes and point DNS at them.
  8. Choose the right data store

    • Probably the most important decision
    • Choose one that scales. For reals.
    • MongoDB, Riak, etc. are built to scale.
    • MySQL/Postgres/etc. don't truly scale.
  9. Choose the right language and framework

    • Ruby on Rails == bad
    • Go == good
    • Not truly related to scalability because you can always throw more hardware at it.
    • But if you like money, choose the right language.
      ◦ We cut our server requirements by 90% by switching
  10. KASS

    • Keep your API as simple as possible.
    • The more features you add, the harder it is to scale.
    • Consider every feature and how it will affect things, most importantly your data store
      ◦ i.e., inspect the queries
  11. 3 or more servers for every layer

    • 1: No!
    • 2: better, but not worth the risk.
    • 3+: bingo
    Bonus points: Put one in a different zone to go for high availability.
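
A rough way to see why the server count matters is to estimate how often a whole layer is down. The sketch below assumes each server fails independently and is up 99% of the time; both the independence assumption and the 99% figure are illustrative and not from the talk, and `layer_availability` is a hypothetical helper name.

```python
def layer_availability(per_server: float, servers: int) -> float:
    """Probability that at least one of `servers` servers is up.

    Assumes failures are independent, which real outages (shared zone,
    shared bad deploy) often violate.
    """
    return 1.0 - (1.0 - per_server) ** servers


if __name__ == "__main__":
    # Compare one, two and three servers at an assumed 99% per-server uptime.
    for n in (1, 2, 3):
        print(n, f"{layer_availability(0.99, n):.6f}")
```

The number also hides capacity: with two servers, losing one leaves a single box carrying all the traffic, which is one reading of "not worth the risk".
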
  12. Cache stuff to take load off the data store and improve performance

    Your database does a lot of work. Take some load off it by caching things. If it's something that is looked up often, like checking authentication, cache it for a short period of time, even if it's just 30 seconds.
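
The short-lived cache described above can be sketched as a small time-to-live cache. This is a minimal in-process illustration, not the approach used at Iron.io: `TTLCache`, `get_or_load` and the `loader` callback are hypothetical names, and with several API servers a shared cache service would normally take the place of a per-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after `ttl_seconds`.

    Not thread-safe; a sketch only.
    """

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        """Return the cached value for `key`, calling `loader` on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: the data store is not touched
        value = loader(key)  # e.g. the real authentication lookup
        self._entries[key] = (now + self._ttl, value)
        return value
```

A request handler would call something like `cache.get_or_load(token, lookup_token_in_db)`, where `lookup_token_in_db` stands for whatever query currently runs on every request. The trade-off is that a revoked token can stay valid for up to the TTL.
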
  13. Queue up everything that doesn't need to be done synchronously

    • Return the required information to the client as fast as possible, queue up the rest.
    • Stats, logs, notifications, etc.
    • Put messages in queue, let other servers (worker servers) deal with the messages on their own time.
    Note: there's this really awesome message queue I heard about called IronMQ.
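
The pattern above can be sketched with Python's standard-library `queue.Queue` standing in for a real message queue, and a thread standing in for a separate worker server. The function names and the message shape are assumptions made for illustration; a hosted queue would replace the `put` and `get` calls with its own client.

```python
import queue
import threading


def worker(jobs: queue.Queue, processed: list) -> None:
    """Drain the queue on its own time; a `None` message tells it to stop."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            processed.append(job)  # stand-in for writing stats, logs, etc.
        finally:
            jobs.task_done()


def handle_request(jobs: queue.Queue, user_id: str) -> dict:
    """Answer the client immediately and defer the non-essential work."""
    jobs.put({"type": "stats", "user_id": user_id})
    return {"status": "ok", "user_id": user_id}


def run_demo() -> tuple:
    jobs: queue.Queue = queue.Queue()
    processed: list = []
    thread = threading.Thread(target=worker, args=(jobs, processed), daemon=True)
    thread.start()
    response = handle_request(jobs, "u1")
    jobs.join()     # wait until the worker has handled the deferred job
    jobs.put(None)  # ask the worker to exit
    thread.join()
    return response, processed
```

The request path only pays for one `put`; how long the worker takes no longer affects response time.
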
  14. Automate everything

    • If you have to SSH into your boxes, you're doing it wrong.
    • You should be able to launch servers that self configure and get added to the resource pool automatically.
    • If you find yourself firing up your SSH client, fix your scripts instead and then launch new servers.
  15. Practice scaling

    • Don't expect things to just work.
    • Set up a staging environment.
    • Launch servers often.
    • Terminate servers often.
    • You should be VERY comfortable with killing and launching servers.
  16. Test everything

    Scale testing:
    • Run load tests, figure out what you can handle
    • Add more resources
    • Repeat...
    Production testing:
    • Test your APIs all the time; CI is not enough
      ◦ Anything in your system can fail; you should find it before your users do
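
The scale-testing loop above starts with a measurement. The sketch below fires a fixed number of calls at a target with bounded concurrency and reports throughput and 95th-percentile latency. `load_test` and `timed` are hypothetical helpers, and the target is any zero-argument callable (for example, a function that issues one HTTP request to a staging endpoint). Dedicated load-testing tools do this far more thoroughly.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles
from typing import Callable


def timed(call: Callable[[], object]) -> float:
    """Run `call` once and return how long it took, in seconds."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start


def load_test(call: Callable[[], object], requests: int, concurrency: int) -> dict:
    """Issue `requests` calls (at least 2) using `concurrency` threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed(call), range(requests)))
    elapsed = time.perf_counter() - started
    return {
        "requests": requests,
        "requests_per_second": requests / elapsed,
        # quantiles(..., n=20) returns 19 cut points; the last is the 95th percentile
        "p95_seconds": quantiles(latencies, n=20)[-1],
    }
```

Running it again after adding resources, with the same `requests` and `concurrency`, gives the before and after comparison the slide asks for.
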
  17. Monitor everything

    • Set up Pingdom to check your API.
      ◦ Make sure it hits an endpoint that touches your database.
    • Install monitoring daemons on your servers and collect that data somewhere.
      ◦ Librato or Datadog are good
    • Collect key metrics
      ◦ Throw them at StatHat or Librato
    • SET UP ALERTS!!!
      ◦ The graphs are nice, but you need to know immediately when something goes wrong.
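
Two of the points above, a health check that touches the database and an alert that fires without anyone watching a graph, can be sketched as follows. The sketch uses the standard-library `sqlite3` module as a stand-in for the real data store; `check_health`, `should_alert` and the three-failure threshold are assumptions made for illustration, not settings from any monitoring product.

```python
import sqlite3
import time


def check_health(conn: sqlite3.Connection) -> dict:
    """Run a trivial query so the check fails when the database does."""
    start = time.perf_counter()
    try:
        conn.execute("SELECT 1").fetchone()
        ok = True
    except sqlite3.Error:
        ok = False
    return {"ok": ok, "latency_seconds": time.perf_counter() - start}


def should_alert(results: list, max_consecutive_failures: int = 3) -> bool:
    """Alert once the most recent `max_consecutive_failures` checks all failed."""
    recent = results[-max_consecutive_failures:]
    return (len(recent) == max_consecutive_failures
            and not any(result["ok"] for result in recent))
```

An external checker would hit an endpoint that returns the result of `check_health`, and whatever collects the results would page someone when `should_alert` turns true.
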