
Lessons in Scalability

Manish Gill
September 09, 2017

In this talk, I present fundamental principles to follow on the journey toward achieving scalability.


Transcript

  1. CREATING AN ANALYTICS SYSTEM
     ▸ Track and Report Website Traffic
     ▸ Identify users
     ▸ Define goals
     ▸ Behavior Analysis
     ▸ Retroactive queries
     ▸ Realtime?
  2. MODELING THE DATA
     ▸ Start with an RDBMS. Proven and reliable way of modeling schemas.
     ▸ Central facts table centered around the User
     ▸ Joined with other tables to do analytical queries
     ▸ Number of hits, number of conversions, and so on
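The facts-table idea can be sketched with a minimal schema. This is an illustration, not the speaker's actual schema: the table and column names (`users`, `events`, `kind`) are my own, and sqlite3 stands in for whatever RDBMS you start with.

```python
import sqlite3

# Hypothetical schema: a central facts table keyed on the user,
# joined with a dimension table for analytical queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    kind TEXT,            -- 'hit' or 'conversion'
    created_at TEXT
);
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO events (user_id, kind, created_at) VALUES (?, ?, ?)",
    [(1, "hit", "2017-09-01"), (1, "hit", "2017-09-02"),
     (1, "conversion", "2017-09-02")],
)
# Analytical query: hits and conversions per user.
rows = conn.execute("""
    SELECT u.name,
           SUM(e.kind = 'hit')        AS hits,
           SUM(e.kind = 'conversion') AS conversions
    FROM users u JOIN events e ON e.user_id = u.id
    GROUP BY u.id
""").fetchall()
print(rows)  # → [('alice', 2, 1)]
```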
  3. DEEP DIVING INTO USER BEHAVIOR
     ▸ Clickstream data - HTML + CSS data capture and visualization
     ▸ Record an entire user session on site and then replay it later.
     ▸ “Send an HTTP call when the user moves the mouse.”
     ▸ Considerable increase in volume
  4. ITERATION #1
     ▸ mkdir my_fancy_analytics_engine
     ▸ git init
     ▸ vim app.py
     ▸ “Database? Lemme just use sqlite for now…”
  5. PERILS OF SQLITE
     ▸ Minimal concurrency support. Only for reading.
     ▸ Database locking: “When any process wants to write, it must lock the entire database file for the duration of its update.”
     ▸ This results in direct data loss. RIP
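The whole-file write lock is easy to reproduce. A small sketch: one connection holds a write transaction while a second tries to write and hits `database is locked` (the file path and table are illustrative; `isolation_level=None` puts Python's `sqlite3` in autocommit so the explicit `BEGIN IMMEDIATE` takes the lock).

```python
import sqlite3, tempfile, os

# One writer takes SQLite's write lock; a second writer fails.
path = os.path.join(tempfile.mkdtemp(), "analytics.db")
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE hits (id INTEGER)")
writer.execute("BEGIN IMMEDIATE")            # grab the write lock now
writer.execute("INSERT INTO hits VALUES (1)")

other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
got_locked = False
try:
    other.execute("INSERT INTO hits VALUES (2)")
except sqlite3.OperationalError as exc:
    got_locked = True
    print(exc)  # → database is locked
writer.execute("COMMIT")
```

In a real ingest pipeline the second writer is another worker process, and its failed insert is the “direct data loss” from the slide unless you retry.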
  6. SCALING SQLITE?
     ▸ Maybe I can have a lot of sqlite DB files instead of one.
     ▸ Divide them up based on customer_id
     ▸ Divide them up even further based on time
     ▸ Daily is too frequent, monthly too infrequent.
     ▸ account_id/week_{week_number}.db looks good
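The `account_id/week_{week_number}.db` routing is a one-liner; here's a sketch using ISO week numbers (one reasonable reading of "week_number" — the talk doesn't specify which convention it used):

```python
from datetime import date

def shard_path(account_id: int, day: date) -> str:
    """Route a write to a per-account, per-week SQLite file (sketch)."""
    week_number = day.isocalendar()[1]  # ISO week of the year
    return f"{account_id}/week_{week_number}.db"

print(shard_path(42, date(2017, 9, 9)))  # → 42/week_36.db
```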
  7. MEH
     ▸ My locking problem has still not been fixed.
     ▸ Too primitive and bare-bones.
     ▸ I really don’t wanna manage 10k database files myself.
     ▸ Please don’t do this.
  8. ITERATION #2
     ▸ PostgreSQL!
     ▸ SQL standard compliant, good JSON support.
     ▸ Parallel query execution
     ▸ MVCC - each process has its own snapshot of the DB
     ▸ Before implementing it, let’s do some benchmarking.
  9. DETOUR - BENCHMARKING
     ▸ Benchmarking isn’t an exact science.
     ▸ Wrong: “My code takes n seconds to run.”
     ▸ Kinda sorta right: “My code takes n seconds to run on a 4-core CPU, 12 GB RAM machine which has no other processes running, and has an SSD capable of reaching 4k maximum IOPS. I can optimize it to insert k messages per second.”
     ▸ TL;DR: It’s complicated.
  10. BENCHMARKING TIPS
     ▸ Keep it close to the real use case if you can.
     ▸ Utilities like iostat, iotop, htop, ftop are your friends.
     ▸ Tweak a config and see the impact.
     ▸ Don’t use resource-heavy monitoring tools.
     ▸ Don’t believe blog posts! Do your own benchmarking.
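"Keep it close to the real use case" can be as simple as measuring sustained insert throughput instead of one wall-clock number. A toy harness (the schema and row count are placeholders; on real hardware you'd also watch iostat/htop alongside it):

```python
import sqlite3, time

def inserts_per_second(n: int = 10_000) -> float:
    """Measure bulk-insert throughput on an in-memory table (toy benchmark)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, kind TEXT)")
    start = time.perf_counter()
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        ((i, "hit") for i in range(n)),
    )
    conn.commit()
    elapsed = time.perf_counter() - start
    return n / elapsed

rate = inserts_per_second()
print(f"{rate:,.0f} inserts/sec")  # varies by machine — which is the point
```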
  11. PORTING LEGACY CODE
     ▸ “Sqlite to Postgres migration should be simple!”
     ▸ Nope. SQL is just one part of the equation. Different systems work differently.
     ▸ Still easier than re-writing for a NoSQL system from scratch.
  12. BACK TO ITERATION #2
     ▸ We now have a system that is:
     ▸ Concurrent. No more database locks. Yay!
     ▸ Keeps data consistent.
     ▸ Handles a decent amount of load.
     ▸ What can go wrong?
  13. PROBLEMS WITH ITERATION #2
     ▸ Data insertion performance will drop in a few days or weeks.
     ▸ Or even in hours during your load testing.
     ▸ Huge tables with hundreds of millions of rows.
     ▸ Worse: large indexes on these tables.
  14. INDEXES ARE A SCAM!
     ▸ Great for READ operations.
     ▸ Suck for WRITE operations: 1 write = 1 table write + 1 index update.
     ▸ PG goes to disk when the index can’t fit in its shared buffers.
     ▸ We’ll end up with higher disk I/O.
  15. ITERATION #3 - IMPROVING PG
     ▸ Avoid indexes in the beginning.
     ▸ Partition one huge table into many smaller tables. PG offers partitioning based on inheritance.
     ▸ Divide up your data:
     ▸ random(data)
     ▸ my_custom_algorithm(data)
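Whatever `my_custom_algorithm(data)` is, the application-side half is just a routing function from a row to a child-table name. A sketch, with the partition count and table prefix as illustrative choices (not from the talk):

```python
N_PARTITIONS = 8  # illustrative; pick based on data volume

def partition_for(customer_id: int) -> str:
    """A 'my_custom_algorithm': stable modulo hash of the customer id,
    so a customer's rows always land in the same child table."""
    return f"events_p{customer_id % N_PARTITIONS}"

print(partition_for(12345))  # → events_p1
```

The deterministic version keeps one customer's data together, which matters for the per-customer analytical queries from earlier slides; `random(data)` spreads load evenly but forces every query to touch all partitions.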
  16. ITERATION #4
     ▸ We now have our DB running perfectly well…for a while.
     ▸ What is going to happen really soon?
  17. SHARDING
     ▸ Shard means “a small part of a whole”.
     ▸ How do we choose where an incoming message should go?
     ▸ database_node = (message_id % 2) + 1
     ▸ database_node = random(1, 2)
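The two routing rules from the slide, written out (the shard count of 2 comes from the slide; everything else is a sketch):

```python
import random

N_SHARDS = 2

def node_by_modulo(message_id: int) -> int:
    # Deterministic: the same message always lands on the same node.
    return (message_id % N_SHARDS) + 1

def node_by_random() -> int:
    # Even spread, but you lose locality: reads must fan out to all shards.
    return random.randint(1, N_SHARDS)

print(node_by_modulo(7))  # → 2
```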
  18. ITERATION #5
     ▸ Our shard can still die.
     ▸ We should at least keep read queries available for customer reporting.
     ▸ Replication (Hot Standby)
  19. REPLICATION
     ▸ Your primary DB should always have a slave running with it.
     ▸ The slave becomes master when the old master dies.
     ▸ Question: what happens when the old master comes back to life? We now have 2 masters and no slaves!
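One common answer to the two-masters question is fencing with a failover generation number. This toy simulation is my own illustration, not PostgreSQL's actual mechanism: each promotion bumps a cluster epoch, and a returning node whose epoch is stale must not accept writes.

```python
# Toy split-brain illustration: fencing via a failover epoch.
class Node:
    def __init__(self, name: str):
        self.name, self.epoch, self.is_master = name, 0, False

cluster_epoch = 0
a, b = Node("a"), Node("b")
a.is_master, a.epoch = True, cluster_epoch   # a is the original master

# Master a dies; replica b is promoted under a new epoch.
cluster_epoch += 1
b.is_master, b.epoch = True, cluster_epoch

# Old master a comes back, still believing it is master...
# ...but only nodes at the current epoch may accept writes.
writable = [n for n in (a, b) if n.is_master and n.epoch == cluster_epoch]
print([n.name for n in writable])  # → ['b'] — stale node 'a' is fenced
```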
  20. ITERATION #6
     ▸ We now have a cluster of DB nodes.
     ▸ Each node has a replica.
     ▸ We are HA, or “Highly Available”.
     ▸ Can something still go wrong? Why do you hate me?
  21. ITERATION #6
     ▸ On the master machine, run the following: DROP DATABASE
     ▸ What happens to the slave? It friggin’ deletes everything too.
     ▸ Slaves are dumb!
  22. BACKUPS!
     ▸ Replicas are not backups.
     ▸ Take regular backups of your entire database.
     ▸ PG offers fantastic support for base backups + WAL archiving.
     ▸ An untested backup is no backup at all.
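A hedged sketch of the base-backup + WAL-archiving setup: the directories are placeholders, and the exact flags you want depend on your PG version and archive storage.

```shell
# postgresql.conf — ship each completed WAL segment to an archive
# (the cp destination is a placeholder; use durable storage in practice):
#   wal_level = replica
#   archive_mode = on
#   archive_command = 'cp %p /archive/wal/%f'

# Take a full base backup into a local directory, streaming WAL with it:
pg_basebackup -D /backup/base_$(date +%F) -X stream

# And then actually restore it somewhere and run queries against it —
# an untested backup is no backup at all.
```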
  23. LESSONS IN SCALABILITY
     1. THINK ABOUT PERF FROM THE START
     2. DO EXTENSIVE BENCHMARKING
     3. USE QUEUES AND PARTITION YOUR DATA
     4. SHARDING IS UNAVOIDABLE AT SCALE
     5. USE REPLICATION
     6. BACKUPS ARE GOING TO SAVE YOU