Sails at scale

Mike McNeil, creator & BDFL · Jun 14, 2023 ·
sailsconf.com at scale How do you scale a Sails/Node.js app? How soon? Why? How will you know? What's it like? What do you run into? What's likely to be my real problem? How much traffic can I handle? How slow can it be? How many engineers can collaborate? What does "scale" even mean?

@mikermcneil Hi I'm Mike! ^Mike

sailsjs.com/features »

Sails at scale @mikermcneil
• Starting assumptions
• Is this advice any good?
• What we'll cover

Starting assumptions

Starting assumptions
• It is important to you that your app works. You want users to have a good time.
• You want to produce value. You want to focus on the right things. You dislike waste. You want to iterate and move quickly.
• Whether you currently handle 5 requests/min or 10k+, you hope to grow traffic further.
• You plan to add features and make changes to your app in the future.
• You expect to continue growing the size of your development team over time.

Starting assumptions
• You define acceptable performance as "fast enough for my users' needs". You are practical.
• Building something users want > winning pissing contests.
• You know the dangers of premature optimization; how it leads to complexity risk and bogs down future changes.
• You are willing to be unpopular*. And brave. And occasionally right.
* You want to learn what is true, not listen to the most confident voice in the room... even when it's yours.

Is this advice any good?

(from a resume gist in 2019 when I was badly in need of cash)

Is this advice any good?
2023: Sails @ Fleet, powering my company's website, business operations, and customers
2023: Sails @ F1000 EV company
2023: Sails @ Fastly (FSLY)
2022: Sails @ F1000 transportation company
2021: Sails @ Paystack (Stripe)
2020: Sails @ Partech (PAR), a few others
2019: Hapi @ Walmart Labs, several others
2018: Sails @ 0 LoC => $1M revenue real estate insurance startup, several others
2017: Sails @ ≥4 YC startups, many others
2016: Perf overhaul in Sails and Waterline core
2015: Sails @ Treeline (TechCrunch / HackerNews launch traffic)
2015: Sails @ Shyp ("Don't use Sails or Waterline" article)
2014: Sails @ Postman, Digium, Snap Kitchen, many others
2013: Sails @ F1000 telco, Reuters, many others
2012: Sails @ enterprise cloud drive startup (deploy-it-yourself Dropbox on OpenStack)

What we'll cover
🏴☠ What is scale?
🏴☠ Scalable by default
🏴☠ Measuring scalability
🏴☠ Scaling, actually
🏴☠ Scaling the Sails framework
🏴☠ Prioritizing scale

🏴☠ What is scale?

🏴☠ What is scale? A journey in 4 dimensions
• Latency
• Concurrency ("true" scalability)
• Stability
• Maintainability

(check out the order that Ezra listed these in) (and that he didn't mention latency)

🏴☠ What is scale? A journey in 4 dimensions
• All 4 dimensions are important.
• Stability is the most important.
• A lot of "scaling" problems are actually stability problems.
• https://npmjs.com/package/sails-hook-dev
• Focus on contributor experience, error messages, stability

🏴☠ Scalable by default

🏴☠ Scalable by default The design ambition behind Sails
• Usability impacts scalability.
• Developer experience
• "Userland developer experience" (e.g. node.js dev)
• "API usage experience" (e.g. frontend dev)
• "Operator experience" (e.g. devops or SRE or the node.js dev)
• "Inheritee experience" (e.g. the node.js dev who takes over if you are bussed)

🏴☠ Measuring scalability

🏴☠ Measuring scalability Benchmarking, load testing, and the art of the YOLO
• Benchmarking
• When is benchmarking useful?
• What does a benchmark actually tell you?
• Benchmarking concurrency vs. latency
• Load testing
• YOLO
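
One way to see why "concurrency vs. latency" matters is to compute both kinds of number from the same benchmark run. The sketch below is plain Node.js and uses made-up sample durations, not measurements from any real Sails app; `percentile` and `summarize` are hypothetical helper names introduced for illustration.

```javascript
// Nearest-rank percentile over an array of durations in milliseconds.
function percentile(durationsMs, p) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Summarize one benchmark run. Latency figures describe individual
// requests; throughput describes how many requests finished per second.
function summarize(durationsMs, wallClockMs) {
  return {
    p50: percentile(durationsMs, 50),
    p95: percentile(durationsMs, 95),
    max: Math.max(...durationsMs),
    requestsPerSecond: durationsMs.length / (wallClockMs / 1000),
  };
}

// Hypothetical sample: 10 requests that all completed within 500ms of
// wall-clock time because they were served concurrently.
const sample = [40, 42, 45, 47, 50, 52, 55, 60, 90, 400];
console.log(summarize(sample, 500));
```

In this invented sample the median and the throughput both look healthy, while the 95th percentile exposes the one request that took 400ms. A benchmark that reports only requests per second would hide that user's experience.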

🏴☠ Scaling, actually

🏴☠ Scaling, actually A look at the bottlenecks; what it's actually like
• What actually happens when you launch a thing...
• ...and it grows?

🏴☠ Scaling, actually A look at the bottlenecks; what it's actually like
A. Productionization
B. Launch
C. Changes
D. Traffic
E. Team

🏴☠ Scaling, actually A. Productionization
• Reality sets in. Even early friendly users find bugs you wouldn't expect.
• This is why it is important to launch quickly and iterate with small, frequent changes.
• The quicker you launch to real users, the easier productionization will be.
• Launch quickly, minimize "surprises".

🏴☠ Scaling, actually B. Launch
• Real world product launch means people are using it.
• 5xx errors are coming in
• Bug reports are coming in
• You realize you don't have the admin tools you need
• Most obvious product design flaws quickly become clear
• Early performance issues due to data volume or inefficient queries arise
• The users show up when you least expect them

🏴☠ Scaling, actually C. Changes
• Hopefully already have code reviews
• Deploying changes, database migrations
• API compatibility
• Deprecated or vulnerable dependencies
• Changes start to become more intentional as you realize the endless work to do

🏴☠ Scaling, actually D. Traffic
• Start to have substantial traffic (lucky, or good product, or you're giving away free boxes of Wheat Thins)
• Not just traffic, but also time and varied usage to explore the nooks and crannies:
  • unusual errors trigger never-before-run code paths, trip on typos that somehow escaped the linter
  • 3rd party APIs go down (you start to notice when Stripe goes down for 19 minutes)
  • a bug corrupts database records, and now you need to run a script to fix it
  • seldom-used features, edge cases, rare combos of these situations
• Eventually, new stuff starts to matter. You might realize that there are features that get way too slow under hotter usage, or run into new bugs that only get caused by specific sequences of events (e.g. a bunch of requests to generate PDFs came in all at once)
• Infrastructure cost begins to matter, at least a little bit.
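
The PDF example on this slide is a burst problem: each request is fine alone, but many arriving together compete for the same CPU and memory. One common mitigation, sketched here in plain Node.js, is to cap how many expensive jobs run at once and queue the rest. `createLimiter` and `fakeGeneratePdf` are hypothetical names for illustration, not Sails APIs, and the 10ms job is a stand-in for real rendering work.

```javascript
// Returns a function that runs async tasks with at most `maxConcurrent`
// in flight. Extra tasks wait in a FIFO queue until a slot frees up.
function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) { return; }
    const { task, resolve, reject } = queue.shift();
    active++;
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next();
      });
  };

  return (task) => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject });
    next();
  });
}

// Simulate 8 requests arriving at once, with at most 2 rendering at a time.
async function demo() {
  const limit = createLimiter(2);
  const stats = { running: 0, peak: 0 };

  // Stand-in for an expensive job such as rendering a PDF (hypothetical).
  const fakeGeneratePdf = async (id) => {
    stats.running++;
    stats.peak = Math.max(stats.peak, stats.running);
    await new Promise((done) => setTimeout(done, 10));
    stats.running--;
    return `pdf-${id}`;
  };

  const results = await Promise.all(
    [1, 2, 3, 4, 5, 6, 7, 8].map((id) => limit(() => fakeGeneratePdf(id)))
  );
  return { results, peak: stats.peak };
}

demo().then(({ results, peak }) => {
  console.log(`${results.length} PDFs generated, peak concurrency ${peak}`);
});
```

The trade-off is deliberate: queued requests wait longer, but the process stays up. An in-process limiter like this only protects a single Node.js process; an app running several processes would need a shared queue instead.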

🏴☠ Scaling, actually E. Team
• You start adding engineers. Engineering managers are hired. Process happens (agile or similar).
• Unplanned work starts to become an issue. You make work visible.
• Hopefully design culture already established (doing wireframe-first or similar)
• Pressure mounts to make more repos; you resist. You invest in training.
• Someone who is talking regularly with customers (product manager) is doing the prioritizing, with close involvement of design and engineering. Product groups form.
• CI/CD grows in utility (and complexity). You turn on branch protections. Set expectation of fast PR review times, start to measure KPIs about PR open time and bug open time.
• You implement CODEOWNERS to put DRIs (directly responsible individuals) in charge of approving changes to certain key repo paths.

🏴☠ Scaling the Sails framework

🏴☠ Scaling Sails apps How the framework can help (or hurt)
A. Request lifecycle and "automatic" server features
B. Userland code
C. Databases, integration, and large datasets

🏴☠ Scaling Sails apps A. Request lifecycle and "automatic" server features
• req/res
• Express
• body parser
• 50ms

🏴☠ Scaling Sails apps B. Userland code
• async/await
• error handling (try/catch), .intercept(), .tolerate()
• error messages and warnings (e.g. "Did you forget to use await?")
• flow control (sails.helpers.flow)
• .retry(), .timeout()
• middleware » routes and policies » actions2
• services » helpers
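
To make the retry and timeout ideas on this slide concrete without requiring Sails to be installed, here is a hand-rolled approximation in plain Node.js. It does not call the chainable methods named above and makes no claim about how they are implemented; `withTimeout`, `withRetry`, and the flaky call are hypothetical stand-ins.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Call `fn` up to `attempts` times, waiting `delayMs` between failures.
// Rethrows the last error if every attempt fails.
async function withRetry(fn, attempts, delayMs) {
  let lastErr;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn(i);
    } catch (err) {
      lastErr = err;
      if (i < attempts) {
        await new Promise((done) => setTimeout(done, delayMs));
      }
    }
  }
  throw lastErr;
}

// Hypothetical flaky 3rd-party call: fails twice, then succeeds.
async function demo() {
  let calls = 0;
  const flaky = async () => {
    calls++;
    if (calls < 3) { throw new Error('upstream unavailable'); }
    return 'ok';
  };
  const result = await withRetry(() => withTimeout(flaky(), 100), 5, 10);
  return { result, calls };
}

demo().then(({ result, calls }) => {
  console.log(`${result} after ${calls} calls`);
});
```

Wrapping each attempt in its own timeout, rather than the whole retry loop, means one hung call cannot consume the entire retry budget.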

🏴☠ Scaling Sails apps B. Userland code
• parley (the great optimization project)
• organics (sails.helpers.http, etc)
• Cloud SDK (simple and automated, reducing buggy code that sends unnecessary/accidental requests)

🏴☠ Scaling Sails apps C. Databases, integrations, and large datasets
• O(n) memory (cursors with .stream() )
• O(n) network or disk calls (look for `await` and loops)
  • 3rd party APIs
  • Database
• Example: @eashaw recently optimized performance of a Sails vulnerability management dashboard for a publicly-traded customer in the cloud hosting space.
  • First fix the ∞s
  • Then fix the intolerables (e.g. 30 seconds)
  • https://github.com/fleetdm/fleet-vulnerability-dashboard/blob/53b8a9caa7798761b929faba0a73b429a57db69c/scripts/update-reports.js#L1-L80
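
The "look for `await` and loops" tip can be illustrated with a fake data store that counts round trips. This is a minimal sketch in plain Node.js: `createFakeDb`, `findOne`, and `findMany` are hypothetical stand-ins invented for the example, not the Waterline API, and the batch loop only imitates the general idea of processing records a chunk at a time.

```javascript
// In-memory stand-in for a database that counts round trips (hypothetical).
function createFakeDb(rows) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  const db = {
    roundTrips: 0,
    async findOne(id) {
      db.roundTrips++;
      return byId.get(id);
    },
    async findMany(ids) {
      db.roundTrips++;
      return ids.map((id) => byId.get(id));
    },
  };
  return db;
}

// O(n) round trips: one awaited query per loop iteration.
async function loadOneByOne(db, ids) {
  const found = [];
  for (const id of ids) {
    found.push(await db.findOne(id));
  }
  return found;
}

// O(n / batchSize) round trips. Each batch is handed to `handleBatch`
// and then dropped, so the loop never accumulates every record at once.
async function loadInBatches(db, ids, batchSize, handleBatch) {
  for (let i = 0; i < ids.length; i += batchSize) {
    const batch = await db.findMany(ids.slice(i, i + batchSize));
    await handleBatch(batch);
  }
}

async function demo() {
  const rows = Array.from({ length: 100 }, (_, i) => ({ id: i + 1 }));
  const ids = rows.map((row) => row.id);

  const slowDb = createFakeDb(rows);
  await loadOneByOne(slowDb, ids);

  const fastDb = createFakeDb(rows);
  let handled = 0;
  await loadInBatches(fastDb, ids, 25, async (batch) => {
    handled += batch.length;
  });

  return { slow: slowDb.roundTrips, fast: fastDb.roundTrips, handled };
}

demo().then((counts) => console.log(counts));
```

With 100 records and a batch size of 25, the loop version makes 100 round trips and the batched version makes 4. Against a real database or 3rd party API, each round trip carries network latency, which is why an innocent-looking `await` inside a loop is often the first thing worth checking.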

🏴☠ Prioritizing scale

🏴☠ Prioritizing scale Why bother? When?
• Focus on usability. Unusable means "bug".
• Bugs first.
• Then everything else.

🏴☠ Prioritizing scale Why bother? When?
• Dogfood it: When you use something yourself, it has to work. You prioritize fixes and focus on what actually matters for whatever practical goal you're aiming to achieve. (Building a compelling product, generating high adoption, driving revenue, etc.)
• Get it out into the world: Community and customers reveal other holes, and then you develop a process around noticing, prioritizing, and fixing those holes.

🏴☠ Announcement

benevolent dictator for life (BDFL) lead maintainer @dominus_kelvin @mikermcneil

Live chat (AMA)

Recap
🏴☠ What is scale? "A journey in 4 dimensions"
🏴☠ Scalable by default "The design ambition behind Sails"
🏴☠ Measuring scalability "Benchmarking vs. load testing vs. YOLO"
🏴☠ Scaling, actually "The bottlenecks; what it's actually like"
🏴☠ Scaling Sails apps "How the framework can help (or hurt)"
🏴☠ Prioritizing scale "Why bother? When?"

</Sails at scale>
@mikermcneil
CEO, Fleet (open source platform for security and IT teams with thousands of computers)
fleetdm.com/handbook/company
linkedin.com/in/mikermcneil
github.com/mikermcneil
twitter.com/mikermcneil
Sails and Fleet artwork by Michael Thomas.
