Lessons in Scalability

Manish Gill
September 09, 2017


In this talk I present some fundamental principles to follow on your journey towards achieving scalability.



Transcript

  1. LESSONS GYAAN IN SCALABILITY

  2. ABOUT ME MANISH GILL @MGILL25

  3. LET'S CREATE AN ANALYTICS SYSTEM

  4. CREATING AN ANALYTICS SYSTEM
     ▸ Track and Report Website Traffic
     ▸ Identify users
     ▸ Define goals
     ▸ Behavior Analysis
     ▸ Retroactive queries
     ▸ Realtime?
  5. [image-only slide]
  6. YOUR TRAFFIC IS NOW MY PROBLEM

  7. MODELING THE DATA
     ▸ Start with an RDBMS. Proven and reliable way of modeling schemas.
     ▸ Central facts table centered around the User
     ▸ Joined with other tables to do analytical queries
     ▸ Number of hits, Number of conversions and so on
  8. DEEP DIVING INTO USER BEHAVIOR
     ▸ Clickstream data - HTML + CSS data capture and visualization
     ▸ Record an entire user session on site and then replay it later.
     ▸ “Send an HTTP call when the user moves the mouse.”
     ▸ Considerable increase in volume
  9. VISUALIZING CLICKSTREAM

  10. YOUR TRAFFIC IS NOW MAKING ME MISERABLE

  11. ITERATION #1
      ▸ mkdir my_fancy_analytics_engine
      ▸ git init
      ▸ vim app.py
      ▸ “Database? Lemme just use sqlite for now…”
  12. PERILS OF SQLITE
      ▸ Minimal Concurrency Support. Only for Reading.
      ▸ Database locking
      ▸ “When any process wants to write, it must lock the entire database file for the duration of its update.”
      ▸ This results in direct data loss. RIP
  13. DATABASE LOCKS IN ACTION [screenshots: a transaction in progress; a failing concurrent INSERT]
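This failure mode is easy to reproduce from Python's sqlite3 module — one connection holds the write lock while a second one tries to INSERT (the table and values here are made up for illustration):

```python
import os
import sqlite3
import tempfile

def demo_write_lock() -> bool:
    """Reproduce the failing concurrent INSERT from the slide."""
    path = os.path.join(tempfile.mkdtemp(), "analytics.db")
    # autocommit mode so we can issue an explicit BEGIN IMMEDIATE
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE hits (user_id INTEGER, page TEXT)")
    writer.execute("BEGIN IMMEDIATE")          # take the write lock now
    writer.execute("INSERT INTO hits VALUES (1, '/home')")
    # A second connection (stand-in for another process) tries to write.
    other = sqlite3.connect(path, timeout=0)   # fail fast instead of waiting
    try:
        other.execute("INSERT INTO hits VALUES (2, '/pricing')")
        return False                           # unexpectedly succeeded
    except sqlite3.OperationalError:           # "database is locked"
        return True
    finally:
        writer.rollback()
        writer.close()
        other.close()
```

With the default `timeout` the second writer would retry for a few seconds before raising; with an ingest firehose those retries exhaust quickly and inserts are simply dropped — the data loss the slide mentions.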

  14. SCALING SQLITE?
      ▸ Maybe I can have a lot of sqlite DB files instead of one.
      ▸ Divide them up based on customer_id
      ▸ Divide them up even further based on time
      ▸ Daily is too frequent, monthly too infrequent.
      ▸ account_id/week_{week_number}.db looks good
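The file layout from this slide could be computed like so (a sketch — the week-number definition, ISO weeks here, is an assumption):

```python
import datetime
import os

def shard_path(account_id: int, when: datetime.date) -> str:
    """Map an event to its per-account, per-week sqlite file."""
    week_number = when.isocalendar()[1]  # ISO week of the year
    return os.path.join(str(account_id), f"week_{week_number}.db")
```

Note that ISO week 1 can start in the previous calendar year, so in a real layout you would likely want the year in the path as well.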
  15. MEH
      ▸ My locking problem has still not been fixed.
      ▸ Too primitive and bare-bones.
      ▸ I really don’t wanna manage 10k database files myself.
      ▸ Please don’t do this.
  16. THINK ABOUT PERF FROM THE START LESSON #1

  17. ITERATION #2
      ▸ PostgreSQL!
      ▸ SQL standard compliant, good JSON support.
      ▸ Parallel query execution
      ▸ MVCC - each process has its own snapshot of the DB
      ▸ Before implementing it, let’s do some benchmarking.
  18. DETOUR - BENCHMARKING
      ▸ Isn’t an exact science.
      ▸ Wrong: “My code takes n seconds to run”
      ▸ Kinda sorta right: “My code takes n seconds to run on a 4-core CPU, 12 GB RAM machine which has no other processes running, and has an SSD capable of reaching 4k maximum IOPS. I can optimize it to insert k messages per second.”
      ▸ TL;DR: It’s complicated.
  19. SYSTEM I/O

  20. BENCHMARKING TIPS
      ▸ Keep it close to the real use case if you can.
      ▸ Utilities like iostat, iotop, htop, ftop are your friends.
      ▸ Tweak a config and see the impact.
      ▸ Don’t use resource-heavy monitoring tools.
      ▸ Don’t believe blog posts! Do your own benchmarking.
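In that spirit, a minimal sketch of an insert-throughput benchmark — here against an in-memory SQLite database purely for illustration; for real numbers, point it at your actual target with data shaped like production traffic:

```python
import sqlite3
import time

def bench_inserts(n: int = 10_000) -> float:
    """Return insert throughput (rows/sec) against an in-memory DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (user_id INTEGER, ts REAL)")
    start = time.perf_counter()
    with conn:  # one transaction; per-row commits would dominate the timing
        conn.executemany(
            "INSERT INTO hits VALUES (?, ?)",
            ((i, start) for i in range(n)),
        )
    elapsed = time.perf_counter() - start
    conn.close()
    return n / elapsed
```

Run it a few times and watch iostat/htop alongside — a single number without the machine context is exactly the "wrong" benchmark from slide 18.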
  21. DO EXTENSIVE BENCHMARKING LESSON #2

  22. PORTING LEGACY CODE
      ▸ “Sqlite to Postgres migration should be simple!”
      ▸ Nope. SQL is just one part of the equation. Different systems work differently.
      ▸ Still easier than re-writing for a NoSQL system from scratch.
  23. BACK TO ITERATION #2
      ▸ We now have a system that is:
      ▸ Concurrent. No more database locks. Yay!
      ▸ Keeps data consistent.
      ▸ Handles a decent amount of load.
      ▸ What can go wrong?
  24. PROBLEMS WITH ITERATION #2
      ▸ Data insertion performance will drop in a few days or weeks.
      ▸ Or even hours during your load testing.
      ▸ Huge tables with hundreds of millions of rows.
      ▸ Worse: Large indexes on these tables.
  25. INDEXES ARE A SCAM!
      ▸ Great for READ operations.
      ▸ Suck for WRITE operations. 1 write = 1 table write + 1 index update.
      ▸ PG spills index pages to disk when they can’t fit in its shared buffers.
      ▸ We’ll end up with higher disk I/O.
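The write amplification above can be made concrete with a toy bookkeeping model — this is not PG's storage engine, just counting logical writes per insert:

```python
class ToyTable:
    """Toy model: every insert touches the heap plus each index."""

    def __init__(self, n_indexes: int):
        self.rows = []
        self.indexes = [dict() for _ in range(n_indexes)]
        self.page_writes = 0  # count of logical write operations

    def insert(self, key, row) -> None:
        self.rows.append(row)
        self.page_writes += 1            # 1 table write...
        for ix in self.indexes:
            ix[key] = len(self.rows) - 1
            self.page_writes += 1        # ...plus 1 write per index
```

So a table with two indexes does three logical writes per row: the more indexes you pile on for reporting queries, the worse your ingest throughput gets.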
  26. ITERATION #3 - IMPROVING PG
      ▸ Avoid indexes in the beginning.
      ▸ Partition one huge table into many smaller tables. PG offers partitioning based on inheritance.
      ▸ Divide up your data:
      ▸ random(data)
      ▸ my_custom_algorithm(data)
  27. TABLE PARTITIONING IN POSTGRESQL [diagram: master table with child tables 1–3]
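A sketch of what the inheritance-based partitioning mentioned above looks like — generating the DDL for one child table per month, with a CHECK constraint describing its range (the `hits` table and `ts` column are assumed names, not from the talk):

```python
def child_partition_ddl(year: int, month: int) -> str:
    """Build DDL for one monthly child of an assumed parent table `hits`."""
    name = f"hits_{year}_{month:02d}"
    nxt_y, nxt_m = (year + 1, 1) if month == 12 else (year, month + 1)
    return (
        f"CREATE TABLE {name} ("
        f" CHECK (ts >= '{year}-{month:02d}-01'"
        f" AND ts < '{nxt_y}-{nxt_m:02d}-01')"
        f") INHERITS (hits);"
    )
```

With inheritance partitioning you also need a trigger (or application logic) to route INSERTs on the parent to the right child; constraint exclusion then lets the planner skip children whose CHECK ranges can't match a query.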
  28. QUEUES TO MANAGE VOLUME - KAFKA
      https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
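The Kafka idea on this slide — buffer incoming events in a log and drain them in batches — can be mimicked in-process with the standard library (`queue.Queue` standing in for the broker, `stored` standing in for the database; a sketch of the shape, not of Kafka itself):

```python
import queue
import threading

events: queue.Queue = queue.Queue()   # stand-in for the Kafka topic
stored = []                           # stand-in for the database
done = threading.Event()

def consumer(batch_size: int = 100) -> None:
    """Drain events in batches so the DB sees few large writes."""
    batch = []
    while not (done.is_set() and events.empty()):
        try:
            batch.append(events.get(timeout=0.1))
        except queue.Empty:
            pass
        if batch and (len(batch) >= batch_size or events.empty()):
            stored.extend(batch)      # stand-in for one bulk INSERT
            batch.clear()

t = threading.Thread(target=consumer)
t.start()
for i in range(1_000):                # producers: one event per pageview
    events.put({"user_id": i % 10, "event": "pageview"})
done.set()
t.join()
```

The point is the decoupling: producers never block on the database, and the consumer turns thousands of tiny writes into a handful of bulk ones. A real broker adds the durability and replay that an in-memory queue obviously lacks.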

  29. USE QUEUES PARTITION YOUR DATA LESSON #3

  30. ITERATION #4
      ▸ We now have our DB running perfectly well…for a while.
      ▸ What is going to happen really soon?
  31. ITERATION #4 ▸ Run out of disk space! [disk usage chart: 96% used, 4% free]

  32. SHARDING
      ▸ Shard means “a small part of a whole”.
      ▸ How do we choose where an incoming message should go?
      ▸ database_node = (message_id % 2) + 1
      ▸ database_node = random(1, 2)
  33. SHARDING [diagram: incoming message routed to node 1 or node 2]
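The two routing rules from the previous slide, as code (node numbering starting at 1, as on the slide):

```python
import random

def shard_for(message_id: int, n_nodes: int = 2) -> int:
    """Deterministic modulo routing: same id always lands on the same node."""
    return (message_id % n_nodes) + 1

def shard_random(n_nodes: int = 2) -> int:
    """The slide's second option: pick a node at random."""
    return random.randint(1, n_nodes)
```

One caveat worth knowing: naive modulo routing remaps almost every key when `n_nodes` changes, which makes adding a node painful; consistent hashing is the usual fix when resharding matters.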

  34. SHARDING IS UNAVOIDABLE AT SCALE LESSON #4

  35. ITERATION #5
      ▸ Our shard can still die.
      ▸ We should at least have read queries enabled for customer reporting.
      ▸ Replication (Hot Standby)
  36. REPLICATION
      ▸ Your primary DB should always have a slave running with it.
      ▸ The slave becomes master when the old master dies.
      ▸ Question: What happens when the old master comes back to life? We now have 2 masters and no slaves!
  37. REPLICATION [diagram: split-brain — master 1 and master 2 both receiving messages, each with its own slave]

  38. USE REPLICATION LESSON #5

  39. ITERATION #6
      ▸ We now have a cluster of DB nodes.
      ▸ Each node has a replica.
      ▸ We are HA or “Highly Available”.
      ▸ Can something still go wrong? Why do you hate me?
  40. ITERATION #6
      ▸ On the master machine, run the following: “DROP DATABASE”
      ▸ What happens to the slave? It friggin’ deletes everything too.
      ▸ Slaves are dumb!
  41. BACKUPS!
      ▸ Replicas are not backups.
      ▸ Take regular backups of your entire database.
      ▸ PG offers fantastic support for base backups + WAL archiving.
      ▸ An untested backup is no backup at all.
  42. BACKUPS ARE GOING TO SAVE YOU LESSON #6

  43. “PREMATURE OPTIMIZATION IS THE ROOT OF ALL EVIL” Donald Knuth

  44. “THAT KNUTH QUOTE IS USELESS” Manish Gill

  45. LESSONS IN SCALABILITY
      1. THINK ABOUT PERF FROM THE START
      2. DO EXTENSIVE BENCHMARKING
      3. USE QUEUES AND PARTITION YOUR DATA
      4. SHARDING IS UNAVOIDABLE AT SCALE
      5. USE REPLICATION
      6. BACKUPS ARE GOING TO SAVE YOU
  46. QUESTIONS?