Slide 1

Slide 1 text

Mike McNeil, creator & BDFL · Jun 14, 2023 · sailsconf.com at scale How do you scale a Sails/Node.js app? How soon? Why? How will you know? What's it like? What do you run into? What's likely to be my real problem? How much traffic can I handle? How slow can it be? How many engineers can collaborate? What does "scale" even mean?

Slide 2

Slide 2 text

@mikermcneil Hi I'm Mike! ^Mike

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

sailsjs.com/features »

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Sails at scale @mikermcneil • Starting assumptions • Is this advice any good? • What we'll cover

Slide 8

Slide 8 text

Starting assumptions

Slide 9

Slide 9 text

Starting assumptions • It is important to you that your app works. You want users to have a good time. • You want to produce value. You want to focus on the right things. You dislike waste. You want to iterate and move quickly. • Whether you currently handle 5 requests/min or 10k+, you hope to grow traffic further. • You plan to add features and make changes to your app in the future. • You expect to continue growing the size of your development team over time.

Slide 10

Slide 10 text

Starting assumptions • You define acceptable performance as "fast enough for my users' needs". You are practical. • Building something users want > winning pissing contests. • You know the dangers of premature optimization; how it leads to complexity risk and bogs down future changes. • You are willing to be unpopular*. And brave. And occasionally right.
 * You want to learn what is true, not listen to the most confident voice in the room... even when it's yours.

Slide 11

Slide 11 text

Is this advice any good?

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

(from a resume gist in 2019 when I was badly in need of cash)

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Is this advice any good? 2023: Sails @ Fleet, powering my company's website, business operations, and customers 2023: Sails @ F1000 EV company 2023: Sails @ Fastly (FSLY) 2022: Sails @ F1000 transportation company 2021: Sails @ Paystack (Stripe) 2020: Sails @ Partech (PAR), a few others 2019: Hapi @ Walmart Labs, several others 2018: Sails @ 0 LoC => $1M revenue real estate insurance startup, several others 2017: Sails @ ≥4 YC startups, many others 2016: Perf overhaul in Sails and Waterline core 2015: Sails @ Treeline (TechCrunch / HackerNews launch traffic) 2015: Sails @ Shyp ("Don't use Sails or Waterline" article) 2014: Sails @ Postman, Digium, Snap Kitchen, many others 2013: Sails @ F1000 telco, Reuters, many others 2012: Sails @ enterprise cloud drive startup (deploy-it-yourself Dropbox on OpenStack) @mikermcneil

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

🏴☠ What is scale? 🏴☠ Scalable by default 🏴☠ Measuring scalability 🏴☠ Scaling, actually 🏴☠ Scaling the Sails framework 🏴☠ Prioritizing scale What we'll cover Sails at scale

Slide 21

Slide 21 text

🏴☠ What is scale?

Slide 22

Slide 22 text

🏴☠ What is scale? A journey in 4 dimensions • Latency • Concurrency ("true" scalability) • Stability • Maintainability

Slide 23

Slide 23 text

(check out the order that Ezra listed these in) (and that he didn't mention latency)

Slide 24

Slide 24 text

🏴☠ What is scale? A journey in 4 dimensions • All 4 dimensions are important. • Stability is the most important. • A lot of "scaling" problems are actually stability problems. • https://npmjs.com/package/sails-hook-dev • Focus on contributor experience, error messages, stability

Slide 25

Slide 25 text

🏴☠ Scalable by default

Slide 26

Slide 26 text

🏴☠ Scalable by default The design ambition behind Sails • Usability impacts scalability. • Developer experience • "Userland developer experience" (e.g. node.js dev) • "API usage experience" (e.g. frontend dev) • "Operator experience" (e.g. devops or SRE or the node.js dev) • "Inheritee experience" (e.g. the node.js dev who takes over if you are bussed)

Slide 27

Slide 27 text

🏴☠ Measuring scalability

Slide 28

Slide 28 text

🏴☠ Measuring scalability Benchmarking, load testing, and the art of the YOLO • Benchmarking • When is benchmarking useful? • What does a benchmark actually tell you? • Benchmarking concurrency vs. latency • Load testing • YOLO
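To make the "concurrency vs. latency" bullet concrete, here is a minimal sketch in plain Node.js. `summarizeRun` is a hypothetical helper (not part of Sails) and the numbers are made up. It assumes you already collected one duration per request plus the wall-clock time of the whole run.

```javascript
// Hypothetical helper (not part of Sails): summarize one benchmark run.
// `latenciesMs`: one duration per completed request, in milliseconds.
// `wallClockMs`: time from the first request sent to the last response received.
function summarizeRun(latenciesMs, wallClockMs) {
  if (latenciesMs.length === 0) {
    throw new Error('Need at least one latency sample.');
  }
  const sorted = [...latenciesMs].sort((a, b) => a - b);

  // Nearest-rank percentile: smallest sample with at least p% of samples <= it.
  const percentile = (p) => {
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  };

  return {
    p50Ms: percentile(50),
    p95Ms: percentile(95),
    maxMs: sorted[sorted.length - 1],
    requestsPerSecond: (sorted.length * 1000) / wallClockMs,
  };
}

// Two made-up runs of 10 requests, each request taking 50ms.
const tenFifties = [50, 50, 50, 50, 50, 50, 50, 50, 50, 50];
const oneAtATime = summarizeRun(tenFifties, 500); // sent back to back
const tenAtOnce = summarizeRun(tenFifties, 50); // sent concurrently

console.log(oneAtATime.p50Ms, oneAtATime.requestsPerSecond); // 50 20
console.log(tenAtOnce.p50Ms, tenAtOnce.requestsPerSecond); // 50 200
```

Same latency, 10x the throughput: the two numbers answer different questions, so a benchmark that reports only one of them tells you about only one dimension.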

Slide 29

Slide 29 text

🏴☠ Scaling, actually

Slide 30

Slide 30 text

🏴☠ Scaling, actually A look at the bottlenecks; what it's actually like • What actually happens when you launch a thing... • ...and it grows?

Slide 31

Slide 31 text

🏴☠ Scaling, actually A look at the bottlenecks; what it's actually like A. Productionization B. Launch C. Changes D. Traffic E. Team

Slide 32

Slide 32 text

🏴☠ Scaling, actually A. Productionization • Reality sets in. Even early friendly users find bugs you wouldn't expect. • This is why it is important to launch quickly and iterate with small, frequent changes. • The quicker you launch to real users, the easier productionization will be. • Launch quickly, minimize "surprises".

Slide 33

Slide 33 text

🏴☠ Scaling, actually B. Launch • Real world product launch means people are using it. • 5xx errors are coming in • Bug reports are coming in • You realize you don't have the admin tools you need • Most obvious product design flaws quickly become clear • Early performance issues due to data volume or inefficient queries arise • The users show up when you least expect them

Slide 34

Slide 34 text

🏴☠ Scaling, actually C. Changes • Hopefully already have code reviews • Deploying changes, database migrations • API compatibility • Deprecated or vulnerable dependencies • Changes start to become more intentional as you realize the endless work to do

Slide 35

Slide 35 text

🏴☠ Scaling, actually D. Traffic • Start to have substantial traffic (lucky, or good product, or you're giving away free boxes of Wheat Thins) • Not just traffic, but also time and varied usage to explore the nooks and crannies: • unusual errors trigger never-before-run code paths, trip on typos that somehow escaped the linter • 3rd party APIs go down (you start to notice when Stripe goes down for 19 minutes) • a bug corrupts database records, and now you need to run a script to fix it • seldom-used features, edge cases, rare combos of these situations • Eventually, new stuff starts to matter. You might realize that there are features that get way too slow under hotter usage, or run into new bugs that only get caused by specific sequences of events (e.g. a bunch of requests to generate PDFs came in all at once) • Infrastructure cost begins to matter, at least a little bit.
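The "requests to generate PDFs came in all at once" case is a burst of expensive work landing in one process. One common response, which the slide itself does not prescribe, is to cap how many of those jobs run at the same time. A minimal sketch in plain Node.js; `createLimiter` is a hypothetical name, not a Sails API:

```javascript
// Hypothetical helper (not a Sails API): returns a function that runs async
// tasks with at most `maxConcurrent` in flight; the rest wait in FIFO order.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) {
      return;
    }
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task) // run the task; a thrown error becomes a rejection
      .then(resolve, reject) // settle the caller's promise either way
      .finally(() => {
        active--;
        next(); // a slot opened up, so start the next waiting task
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}
```

A caller would create one limiter per expensive resource, for example `const limitPdf = createLimiter(2);`, and wrap each job as `await limitPdf(() => renderPdf(report))`, where `renderPdf` stands in for whatever slow function your app has.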

Slide 36

Slide 36 text

🏴☠ Scaling, actually E. Team • You start adding engineers. Engineering managers are hired. Process happens (agile or similar). • Unplanned work starts to become an issue. You make work visible. • Hopefully design culture already established (doing wireframe-first or similar) • Pressure mounts to make more repos; you resist. You invest in training. • Someone who is talking regularly with customers (product manager) is doing the prioritizing, with close involvement of design and engineering. Product groups form. • CI/CD grows in utility (and complexity). You turn on branch protections. Set expectation of fast PR review times, start to measure KPIs about PR open time and bug open time. • You implement CODEOWNERS to put DRIs (directly responsible individuals) in charge of approving changes to certain key repo paths.

Slide 37

Slide 37 text

🏴☠ Scaling the Sails framework

Slide 38

Slide 38 text

🏴☠ Scaling Sails apps How the framework can help (or hurt) A. Request lifecycle and "automatic" server features B. Userland code C. Databases, integration, and large datasets

Slide 39

Slide 39 text

🏴☠ Scaling Sails apps A. Request lifecycle and "automatic" server features • req/res • Express • body parser • 50ms

Slide 40

Slide 40 text

🏴☠ Scaling Sails apps B. Userland code • async/await • error handling (try/catch), .intercept(), .tolerate() • error messages and warnings (e.g. "Did you forget to use await?") • flow control (sails.helpers.flow) • .retry(), .timeout() • middleware » routes and policies » actions2 • services » helpers
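The `.retry()` and `.timeout()` bullets name framework features. To show the general idea behind them without guessing at Sails' signatures, here is a sketch in plain Node.js. `withTimeout` and `withRetry` are hypothetical names, and this is not how Sails implements either feature.

```javascript
// Plain Node.js sketch; `withTimeout` and `withRetry` are hypothetical names,
// not Sails APIs.

// Reject with code 'E_TIMEOUT' if `promise` has not settled within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`Timed out after ${ms}ms`);
      err.code = 'E_TIMEOUT';
      reject(err);
    }, ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Call `fn` up to `maxAttempts` times; rethrow the last error if all fail.
async function withRetry(fn, maxAttempts) {
  let lastErr;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastErr = err;
    }
  }
  throw lastErr;
}
```

The two compose: `await withRetry(() => withTimeout(callFlakyApi(), 2000), 3)` gives each of three attempts two seconds, where `callFlakyApi` stands in for any promise-returning call. Note that the timeout only stops waiting; it does not cancel the underlying work.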

Slide 41

Slide 41 text

🏴☠ Scaling Sails apps B. Userland code • parley (the great optimization project) • organics (sails.helpers.http, etc) • Cloud SDK (simple and automated, reducing buggy code that sends unnecessary/accidental requests)

Slide 42

Slide 42 text

🏴☠ Scaling Sails apps C. Databases, integrations, and large datasets • O(n) memory (cursors with .stream()) • O(n) network or disk calls (look for `await` and loops) • 3rd party APIs • Database • Example: @eashaw recently optimized performance of a Sails vulnerability management dashboard for a publicly-traded customer in the cloud hosting space. • First fix the ∞s • Then fix the intolerables (e.g. 30 seconds) • https://github.com/fleetdm/fleet-vulnerability-dashboard/blob/53b8a9caa7798761b929faba0a73b429a57db69c/scripts/update-reports.js#L1-L80
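"Look for `await` and loops" can be sketched like this. The example uses a fake in-memory data source that counts round trips; `fakeDb`, its methods, and the host/owner data are all hypothetical stand-ins, not Waterline APIs and not code from the linked script.

```javascript
// Fake data source that counts round trips. Everything here is a
// hypothetical stand-in for a real database or 3rd party API.
const fakeDb = {
  roundTrips: 0,
  owners: new Map([[1, 'ada'], [2, 'grace'], [3, 'linus']]),
  async findOwner(id) {
    this.roundTrips++;
    return this.owners.get(id);
  },
  async findOwners(ids) {
    this.roundTrips++;
    return new Map(ids.map((id) => [id, this.owners.get(id)]));
  },
};

// O(n) round trips: one `await` per loop iteration.
async function labelHostsOneByOne(hosts) {
  const labeled = [];
  for (const host of hosts) {
    const owner = await fakeDb.findOwner(host.ownerId);
    labeled.push(`${host.name}:${owner}`);
  }
  return labeled;
}

// One round trip: collect the ids, fetch once, then join in memory.
async function labelHostsBatched(hosts) {
  const ids = [...new Set(hosts.map((host) => host.ownerId))];
  const ownersById = await fakeDb.findOwners(ids);
  return hosts.map((host) => `${host.name}:${ownersById.get(host.ownerId)}`);
}
```

Both functions return the same labels. The difference is that the first one's round-trip count grows with the number of hosts, which is harmless at 10 records and one of the "intolerables" at 100,000. The batched version trades that for holding all the owners in memory at once, which is where cursors come in for truly large sets.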

Slide 43

Slide 43 text

🏴☠ Prioritizing scale

Slide 44

Slide 44 text

🏴☠ Prioritizing scale Why bother? When? • Focus on usability. Unusable means "bug". • Bugs first. • Then everything else.

Slide 45

Slide 45 text

🏴☠ Prioritizing scale Why bother? When? • Dogfood it: When you use something yourself, it has to work. You prioritize fixes and focus on what actually matters for whatever practical goal you're aiming to achieve. (Building a compelling product, generating high adoption, driving revenue, etc) • Get it out into the world: Community and customers reveal other holes, and then you develop a process around noticing, prioritizing, and fixing those holes.

Slide 46

Slide 46 text

🏴☠ Announcement

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

benevolent dictator for life (BDFL) lead maintainer @dominus_kelvin @mikermcneil

Slide 49

Slide 49 text

Live chat (AMA)

Slide 50

Slide 50 text

Recap 🏴☠ What is scale? 🏴☠ Scalable by default 🏴☠ Measuring scalability 🏴☠ Scaling, actually 🏴☠ Scaling Sails apps 🏴☠ Prioritizing scale "A journey in 4 dimensions" "The design ambition behind Sails" "Benchmarking vs. load testing vs. YOLO" "The bottlenecks; what it's actually like" "How the framework can help (or hurt)" "Why bother? When?"

Slide 51

Slide 51 text

@mikermcneil CEO, Fleet (open source platform for security and IT teams with thousands of computers) fleetdm.com/handbook/company linkedin.com/in/mikermcneil github.com/mikermcneil twitter.com/mikermcneil Sails and Fleet artwork by Michael Thomas.