Scaling Cloud Servers for Seasonal Traffic – ClearTax

I gave this talk at Nasscom's Annual Technology Conference in New Delhi on 12th December, 2014.

ClearTax (YC S14) helps individuals e-File their Income Tax Returns on the web and mobile. The demand curve for our services is very seasonal – with a huge spike in traffic close to 31st July (the main tax filing deadline in India), and at the end of each quarter.

This talk focuses on the key lessons ClearTax learned: how to scale a large service in the cloud with a very small startup team.

Feel free to reach out to me if you have any questions or suggestions! I'm _anks on Twitter.

PS: ClearTax is hiring engineers & designers. For more info, visit https://cleartax.in/meta/careers

Ankit Solanki

December 20, 2014

Transcript

  1. The Spike
    Most startups have to deal with it
    ClearTax: E-Filing Deadlines
    Flipkart: Big Billion Sale
    Amazon: Black Friday
    Netflix: House of Cards release
  2. The Spike
    It’s not always predictable (think Slashdot-effect)
    Can be 10-100x your usual traffic
    Probably will not last
  3. Monitoring: Set up Notifications!
    Learn about outages as they happen
    (and) Learn about usage spikes as they happen
  4. Scaling Your Server
    Scale-Up ⬍ [diagram]
    Scale-Out ⬌ [diagram]
  5. Horizontal Scaling Checklist
    ✔ Don’t rely on local memory
    No in-memory sessions. Use Redis or equivalent
  6. Horizontal Scaling Checklist
    ✔ One-Click Deployments
    Rely on platforms
    PaaS: Heroku-like platforms
    Or something provided by your Cloud Provider
    (Or, roll your own: Docker, Chef, Puppet, Ansible)
  7. Recommendation for Startups
    Go with a PaaS when you’re young
    Build the product, don’t do DevOps when you don’t need to!
  8. What ClearTax did in July
    Added a new sub-system for background tasks
    Infrastructure for bringing up web and background workers, on-demand
    Logging, monitoring sub-systems put in place
  9. Growing Pains
    In the first week of July, noticed slow response times
    Bottleneck identified: Database
  10. Database
    From initial analysis, the problem was due to locks and contention
    Did a thorough code review
    Wrote tooling to visualize execution times
  11. Learnings
    Distributed systems are hard
    Find bottlenecks as soon as possible
    Real-world usage is different from load tests
  12. DevOps should monitor more than just hardware
    With sudden traffic:
    Your payment gateway may go down
    Your email provider may start throttling you
  13. Have the basics in place
    Monitor everything!
    Things will fail: this is expected
    Attempt fixes by upgrading infrastructure first
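
A note on slide 3 (learning about usage spikes as they happen): the slides do not say which monitoring tool ClearTax used, so the following is only a minimal sketch of the idea. It compares each new requests-per-minute sample against a rolling average and calls a notification hook when the sample is several times higher. The class name `SpikeDetector`, the `factor` threshold, and the `notify` hook are all assumptions made for illustration.

```python
from collections import deque


class SpikeDetector:
    """Flags a spike when a sample far exceeds the recent average."""

    def __init__(self, window=30, factor=5.0, notify=print):
        self.history = deque(maxlen=window)  # recent requests-per-minute samples
        self.factor = factor                 # multiple of the baseline that counts as a spike
        self.notify = notify                 # hypothetical hook: email, SMS, chat webhook

    def record(self, requests_per_minute):
        """Store one sample; notify and return True if it looks like a spike."""
        baseline = sum(self.history) / len(self.history) if self.history else None
        is_spike = (
            baseline is not None
            and baseline > 0
            and requests_per_minute >= self.factor * baseline
        )
        if is_spike:
            self.notify(
                f"Traffic spike: {requests_per_minute} req/min "
                f"vs baseline {baseline:.0f}"
            )
        self.history.append(requests_per_minute)
        return is_spike
```

In practice a hosted monitoring service would usually do this job; the point is only that the alert fires on traffic, not just on outages.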
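
A note on slide 5 (no in-memory sessions): one way to read this advice is that any web server should be able to serve any request, so session data has to live in a store all servers share. The sketch below assumes a client object exposing `get(key)` and `setex(key, ttl, value)`, which are the method names redis-py's client uses. `SessionStore` and `FakeKeyValueClient` are hypothetical names, and the fake client exists only so the example runs without a Redis server.

```python
import json
import uuid


class SessionStore:
    """Keeps session data in an external key-value store, not process memory."""

    def __init__(self, client, ttl_seconds=1800):
        # `client` is anything with get(key) and setex(key, ttl, value).
        self.client = client
        self.ttl = ttl_seconds

    def create(self, data):
        session_id = uuid.uuid4().hex
        self.client.setex(f"session:{session_id}", self.ttl, json.dumps(data))
        return session_id

    def load(self, session_id):
        raw = self.client.get(f"session:{session_id}")
        return json.loads(raw) if raw is not None else None


class FakeKeyValueClient:
    """In-memory stand-in for demonstration only; it ignores the TTL."""

    def __init__(self):
        self._data = {}

    def setex(self, key, ttl, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


shared = FakeKeyValueClient()    # in production: one shared Redis instance
server_a = SessionStore(shared)  # two web servers behind a load balancer
server_b = SessionStore(shared)
sid = server_a.create({"user_id": 42})
print(server_b.load(sid))        # server B can read the session created on server A
```

Because neither server holds the session itself, servers can be added or removed during a spike without logging users out.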
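
A note on slide 8 (bringing up web and background workers on demand): the slides do not describe how ClearTax decided how many workers to run. As a rough illustration, a sizing rule could be derived from the current backlog of background jobs. The function name `desired_workers` and all of its parameters are assumptions, not details from the talk.

```python
import math


def desired_workers(backlog, jobs_per_worker_per_minute, drain_minutes=5,
                    min_workers=1, max_workers=20):
    """Return how many workers would clear `backlog` within `drain_minutes`,
    clamped to the range [min_workers, max_workers]."""
    capacity_per_worker = jobs_per_worker_per_minute * drain_minutes
    needed = math.ceil(backlog / capacity_per_worker)
    return max(min_workers, min(max_workers, needed))


print(desired_workers(backlog=120, jobs_per_worker_per_minute=10))  # prints 3
```

The upper bound matters as much as the formula: it keeps a runaway queue from starting more workers than the database can handle.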
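
A note on slide 10 (tooling to visualize execution times): the actual ClearTax tooling is not shown in the deck. A minimal sketch of the same idea is to wrap suspect code paths in a timer and print the totals as text bars, slowest first. The names `timed` and `report`, and the labels in the usage lines, are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # label -> list of durations in seconds


@contextmanager
def timed(label):
    """Record how long the wrapped block takes under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def report(width=40):
    """Return one text bar per label, sorted by total time, slowest first."""
    totals = {label: sum(values) for label, values in timings.items()}
    slowest = max(totals.values(), default=0.0) or 1.0
    lines = []
    for label, total in sorted(totals.items(), key=lambda item: -item[1]):
        bar = "#" * max(1, round(width * total / slowest))
        calls = len(timings[label])
        lines.append(f"{label:<20} {calls:>4} calls {total * 1000:9.1f} ms {bar}")
    return lines


# Usage: time.sleep stands in for real database calls.
with timed("load_tax_return"):
    time.sleep(0.02)
with timed("save_draft"):
    time.sleep(0.005)
print("\n".join(report()))
```

Sorting by total time rather than average time surfaces the queries that hold locks longest overall, which is the kind of contention the slide describes.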
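
A note on slide 12 (monitoring more than just hardware): third-party services such as a payment gateway or email provider can be checked the same way as your own servers. The sketch below runs a set of named check functions and collects the ones that fail. `check_dependencies` and both check functions are hypothetical; real checks would call each provider's own status or test endpoint.

```python
def check_dependencies(checks):
    """Run each named check; return {name: reason} for the ones that failed."""
    failures = {}
    for name, check in checks.items():
        try:
            if not check():
                failures[name] = "check returned False"
        except Exception as exc:  # a crashing check counts as a failure too
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical checks used only to exercise the function.
def payment_gateway_ok():
    return True


def email_provider_ok():
    raise TimeoutError("no response in 5s")


failures = check_dependencies({
    "payment_gateway": payment_gateway_ok,
    "email_provider": email_provider_ok,
})
print(failures)  # {'email_provider': 'TimeoutError: no response in 5s'}
```

Running such checks on a schedule and feeding failures into the same notification channel as server alerts keeps external outages from being discovered by users first.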