Sails at scale

#sailsconf2023

📺 Watch: https://www.youtube.com/watch?v=KSa5PoKNPeQ

Other Sailsconf 2023 videos: https://www.youtube.com/@sailscasts

The talks from Sailsconf 2021 and Sailsconf 2022 are also available on YouTube, including Mike's keynotes.

(The decks are somewhere. TODO: ask Kelvin if he has a copy, or hunt them down, and upload em)

Mike McNeil

June 14, 2023

Transcript

  1. Mike McNeil, creator & BDFL · Jun 14, 2023 ·

    sailsconf.com at scale How do you scale a Sails/Node.js app? How soon? Why? How will you know? What's it like? What do you run into? What's likely to be my real problem? How much traffic can I handle? How slow can it be? How many engineers can collaborate? What does "scale" even mean?
  2. Starting assumptions • It is important to you that your

    app works. You want users to have a good time. • You want to produce value. You want to focus on the right things. You dislike waste. You want to iterate and move quickly. • Whether you currently handle 5 requests/min or 10k+, you hope to grow traffic further. • You plan to add features and make changes to your app in the future. • You expect to continue growing the size of your development team over time.
  3. Starting assumptions • You define acceptable performance as

    "fast enough for my users' needs". You are practical. • Building something users want > winning pissing contests. • You know the dangers of premature optimization; how it leads to complexity risk and bogs down future changes. • You are willing to be unpopular*. And brave. And occasionally right. 
 * You want to learn what is true, not listen to the most confident voice in the room... even when it's yours.
  4. Is this advice any good? 2023: Sails @ Fleet, powering

    my company's website, business operations, and customers 2023: Sails @ F1000 EV company 2023: Sails @ Fastly (FSLY) 2022: Sails @ F1000 transportation company 2021: Sails @ Paystack (Stripe) 2020: Sails @ Partech (PAR), a few others 2019: Hapi @ Walmart Labs, several others 2018: Sails @ 0 LoC => $1M revenue real estate insurance startup, several others 2017: Sails @ ≥4 YC startups, many others 2016: Perf overhaul in Sails and Waterline core 2015: Sails @ Treeline (TechCrunch / HackerNews launch traffic) 2015: Sails @ Shyp ("Don't use Sails or Waterline" article) 2014: Sails @ Postman, Digium, Snap Kitchen, many others 2013: Sails @ F1000 telco, Reuters, many others 2012: Sails @ enterprise cloud drive startup (deploy-it-yourself Dropbox on OpenStack) @mikermcneil
  5. 🏴☠ What is scale? 🏴☠ Scalable by default 🏴☠ Measuring

    scalability 🏴☠ Scaling, actually 🏴☠ Scaling the Sails framework 🏴☠ Prioritizing scale What we'll cover Sails at scale
  6. 🏴☠ What is scale? A journey in 4 dimensions •

    Latency • Concurrency ("true" scalability) • Stability • Maintainability
  7. 🏴☠ What is scale? A journey in 4 dimensions •

    All 4 dimensions are important. • Stability is the most important. • A lot of "scaling" problems are actually stability problems. • https://npmjs.com/package/sails-hook-dev • Focus on contributor experience, error messages, stability
  8. 🏴☠ Scalable by default The design ambition behind Sails •

    Usability impacts scalability. • Developer experience • "Userland developer experience" (e.g. node.js dev) • "API usage experience" (e.g. frontend dev) • "Operator experience" (e.g. devops or SRE or the node.js dev) • "Inheritee experience" (e.g. the node.js dev who takes over if you are bussed)
  9. 🏴☠ Measuring scalability Benchmarking, load testing, and the art of

    the YOLO • Benchmarking • When is benchmarking useful? • What does a benchmark actually tell you? • Benchmarking concurrency vs. latency • Load testing • YOLO
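The difference between benchmarking latency and benchmarking concurrency can be made concrete with a rough sketch. This is not from the talk: `fakeHandler`, `percentile`, and `runLoad` are hypothetical names, and the handler is a timer stand-in rather than a real Sails action or HTTP request.

```javascript
// Hypothetical stand-in for a request handler: resolves after roughly 5ms.
// In a real load test this would be an HTTP call to your running app.
function fakeHandler() {
  return new Promise((resolve) => setTimeout(resolve, 5));
}

// Nearest-rank percentile over an array of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) { throw new Error('No samples'); }
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Run `total` calls with at most `concurrency` in flight at once.
// Returns per-call latencies (ms) and overall throughput (calls/sec).
async function runLoad(handler, { total, concurrency }) {
  const latencies = [];
  let started = 0;
  const startedAt = process.hrtime.bigint();

  async function worker() {
    // The check and increment happen synchronously, so workers never
    // start more than `total` calls between them.
    while (started < total) {
      started++;
      const t0 = process.hrtime.bigint();
      await handler();
      latencies.push(Number(process.hrtime.bigint() - t0) / 1e6);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  const elapsedSec = Number(process.hrtime.bigint() - startedAt) / 1e9;
  return { latencies, throughput: total / elapsedSec };
}
```

Calling `runLoad(fakeHandler, { total: 50, concurrency: 1 })` and then again with `concurrency: 10`, and comparing `percentile(latencies, 95)` against `throughput`, gives two separate numbers to watch. With this timer stand-in, throughput rises with concurrency while per-call latency stays roughly flat; a real handler that contends for CPU or a database will usually not behave that nicely, which is the part a load test is meant to reveal.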
  10. 🏴☠ Scaling, actually A look at the bottlenecks; what it's

    actually like • What actually happens when you launch a thing... • ...and it grows?
  11. 🏴☠ Scaling, actually A look at the bottlenecks; what it's

    actually like A. Productionization B. Launch C. Changes D. Traffic E. Team
  12. 🏴☠ Scaling, actually A. Productionization • Reality sets in. Even

    early friendly users find bugs you wouldn't expect. • This is why it is important to launch quickly and iterate with small, frequent changes. • The quicker you launch to real users, the easier productionization will be. • Launch quickly, minimize "surprises".
  13. 🏴☠ Scaling, actually B. Launch • Real world product launch

    means people are using it. • 5xx errors are coming in • Bug reports are coming in • You realize you don't have the admin tools you need • Most obvious product design flaws quickly become clear • Early performance issues due to data volume or inefficient queries arise • The users show up when you least expect them
  14. 🏴☠ Scaling, actually C. Changes • Hopefully already have code

    reviews • Deploying changes, database migrations • API compatibility • Deprecated or vulnerable dependencies • Changes start to become more intentional as you realize the endless work to do
  15. 🏴☠ Scaling, actually D. Traffic • Start to

    have substantial traffic (lucky, or good product, or you're giving away free boxes of Wheat Thins) • Not just traffic, but also time and varied usage to explore the nooks and crannies: • unusual errors trigger never-before-run code paths, trip on typos that somehow escaped the linter • 3rd party APIs go down (you start to notice when Stripe goes down for 19 minutes) • a bug corrupts database records, and now you need to run a script to fix it • seldom-used features, edge cases, rare combos of these situations • Eventually, new stuff starts to matter. You might realize that there are features that get way too slow under hotter usage, or run into new bugs that only get caused by specific sequences of events (e.g. a bunch of requests to generate PDFs came in all at once) • Infrastructure cost begins to matter, at least a little bit.
  16. 🏴☠ Scaling, actually E. Team • You start adding engineers.

    Engineering managers are hired. Process happens (agile or similar). • Unplanned work starts to become an issue. You make work visible. • Hopefully design culture already established (doing wireframe-first or similar) • Pressure mounts to make more repos; you resist. You invest in training. • Someone who is talking regularly with customers (product manager) is doing the prioritizing, with close involvement of design and engineering. Product groups form. • CI/CD grows in utility (and complexity). You turn on branch protections. Set expectation of fast PR review times, start to measure KPIs about PR open time and bug open time. • You implement CODEOWNERS to put DRIs (directly responsible individuals) in charge of approving changes to certain key repo paths.
  17. 🏴☠ Scaling Sails apps How the framework can help (or

    hurt) A. Request lifecycle and "automatic" server features B. Userland code C. Databases, integration, and large datasets
  18. 🏴☠ Scaling Sails apps A. Request lifecycle and "automatic" server

    features • req/res • Express • body parser • 50ms
  19. 🏴☠ Scaling Sails apps B. Userland code • async/await •

    error handling (try/catch), .intercept(), .tolerate() • error messages and warnings (e.g. "Did you forget to use await?") • flow control (sails.helpers.flow) • .retry(), .timeout() • middleware » routes and policies » actions2 • services » helpers
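The retry and timeout ideas named on this slide can be sketched in plain async/await. This is not Sails or parley code and does not show their API: `withTimeout`, `withRetry`, and `flakyCall` are hypothetical names used only to illustrate the pattern so it runs without any dependencies.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying work.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`Timed out after ${ms}ms`);
      err.name = 'TimeoutError';
      reject(err);
    }, ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Call `fn` up to `delays.length + 1` times, waiting delays[i] ms
// after the i-th failure. The last error is rethrown if all attempts fail.
async function withRetry(fn, delays = [50, 100]) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delays.length) { throw err; }
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

// Hypothetical flaky dependency: fails twice, then succeeds.
let calls = 0;
async function flakyCall() {
  calls++;
  if (calls < 3) { throw new Error('Temporary failure'); }
  return 'ok';
}
```

A caller would combine the two as `await withRetry(() => withTimeout(flakyCall(), 1000))`, so each attempt gets its own time budget. Having this available as chainable methods in the framework, as the slide lists, mainly saves userland code from re-implementing wrappers like these by hand.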
  20. 🏴☠ Scaling Sails apps B. Userland code • parley (the

    great optimization project) • organics (sails.helpers.http, etc) • Cloud SDK (simple and automated, reducing buggy code that sends unnecessary/accidental requests)
  21. 🏴☠ Scaling Sails apps C. Databases, integrations, and large datasets

    • O(n) memory (cursors with .stream()) • O(n) network or disk calls (look for `await` and loops) • 3rd party APIs • Database • Example: @eashaw recently optimized performance of a Sails vulnerability management dashboard for a publicly-traded customer in the cloud hosting space. • First fix the ∞s • Then fix the intolerables (e.g. 30 seconds) • https://github.com/fleetdm/fleet-vulnerability-dashboard/blob/53b8a9caa7798761b929faba0a73b429a57db69c/scripts/update-reports.js#L1-L80
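The "look for `await` and loops" advice can be illustrated with a small sketch. It uses a hypothetical in-memory `fakeDb` that counts round trips, standing in for a real datastore or 3rd party API; it is not Waterline, and the function names are made up for the example.

```javascript
// Hypothetical in-memory "database" that counts round trips.
const fakeDb = {
  roundTrips: 0,
  users: new Map([[1, 'ada'], [2, 'grace'], [3, 'linus'], [4, 'margaret']]),
  async findOne(id) {
    this.roundTrips++;
    return this.users.get(id);
  },
  async findMany(ids) {
    this.roundTrips++;
    return ids.map((id) => this.users.get(id));
  },
};

// O(n) round trips: one `await` per loop iteration.
async function namesOneByOne(ids) {
  const names = [];
  for (const id of ids) {
    names.push(await fakeDb.findOne(id));
  }
  return names;
}

// One round trip: a single batched lookup for the same ids.
async function namesBatched(ids) {
  return fakeDb.findMany(ids);
}

// Bounded memory: yield fixed-size batches of records instead of loading
// every record at once, similar in spirit to iterating a cursor.
async function* inBatches(ids, batchSize) {
  for (let i = 0; i < ids.length; i += batchSize) {
    yield fakeDb.findMany(ids.slice(i, i + batchSize));
  }
}
```

For four ids, `namesOneByOne` makes four round trips and `namesBatched` makes one; the gap grows linearly with the dataset. `inBatches` sits in between: consumed with `for await (const batch of inBatches(ids, 3))`, it trades a few extra round trips for never holding more than one batch of records in memory.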
  22. 🏴☠ Prioritizing scale Why bother? When? • Focus on usability.

    Unusable means "bug". • Bugs first. • Then everything else.
  23. 🏴☠ Prioritizing scale Why bother? When? • Dogfood it: When

    you use something yourself, it has to work. You prioritize fixes and focus on what actually matters for whatever practical goal you're aiming to achieve. (Building a compelling product, generating high adoption, driving revenue, etc) • Get it out into the world: Community and customers reveal other holes, and then you develop a process around noticing, prioritizing, and fixing those holes.
  24. Recap 🏴☠ What is scale? 🏴☠ Scalable by default 🏴☠

    Measuring scalability 🏴☠ Scaling, actually 🏴☠ Scaling Sails apps 🏴☠ Prioritizing scale "A journey in 4 dimensions" "The design ambition behind Sails" "Benchmarking vs. load testing vs. YOLO" "The bottlenecks; what it's actually like" "How the framework can help (or hurt)" "Why bother? When?"
  25. </Sails at scale> @mikermcneil CEO, Fleet (open source platform for

    security and IT teams with thousands of computers) fleetdm.com/handbook/company linkedin.com/in/mikermcneil github.com/mikermcneil twitter.com/mikermcneil Sails and Fleet artwork by Michael Thomas.