Scaling Your Node.JS API Like a Boss

Transcript

Scaling your NODE JS API …like a boss http://bit.ly/nodejs-rocks Volkan Özçelik, March 7, 2016 http://volkan.io/ @linkibol v0lkan

API Scalability (in theory)

API Scalability (in practice): nothing is linear; $#!% will eventually happen!

About Me • Volkan Özçelik — JavaScript Lover & Performance

Freak • Current: • Technical Lead @ Cisco • Before: • Mobile Frontend Engineer @ Jive Software • VP of Technology @ grou.ps (now GymGroups) • CTO @ cember.net (acquired by Xing )  • Chase Me:  @linkibol    v0lkan

volkan.io Slides & Source Code

Agenda • Node’s Strengths and Weaknesses • Tweaking Our OS

• Throughput, Concurrency, Latency • Scale a Real-Life Node App

In a Nutshell… How do I Architect a Scalable and Consistent Node.JS API with Manageable Complexity?

Don’t Fight Windmills • Keep things simpler. • Build something

that’s good enough for your purpose. • Solve for the problems that are actually on your plate.

Don’t Invent Problems That You Don’t Have (Yet) • Monitor All The Things • Collect Metrics • Form a Hypothesis • Gather Evidence • Validate Your Hypothesis • Take Corrective Action If Needed

Goals • Minimize Client Response Time • Maximize Resource Efficiency on the Server  Hint: Leave 50% of the memory unused (for taking core dumps)

High-Level Topology of an API Service API Service Load Balancer

SSL Termination Load Balancing API Gateway Authentication Authorization Token Exchange Rate Limiting … HTTP Proxy Clients

So… JavaScript? JavaScript

Show Love to Functions • Accept JavaScript’s functional and composable

nature. • Avoid `this` and avoid `new` — You’ll thank me later. • Create Focused, Independent, Reusable, and Testable Modules.
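A minimal sketch of what "functional and composable" can look like in practice. The names `createCounter`, `compose`, and `toSlug` are illustrative and do not come from the talk's source code. State lives in a closure, so there is no `this` to lose when a method is passed around, and small functions are combined into a larger one.

```javascript
// Factory function: state is captured in a closure instead of on `this`,
// so the returned functions keep working when detached from the object.
function createCounter(start) {
  let count = start;
  return {
    increment: function increment() { count += 1; return count; },
    current: function current() { return count; }
  };
}

// compose(f, g, h)(x) is equivalent to f(g(h(x))).
function compose() {
  const fns = Array.prototype.slice.call(arguments);
  return function composed(input) {
    return fns.reduceRight(function apply(acc, fn) { return fn(acc); }, input);
  };
}

// Three focused, independently testable functions.
function trim(s) { return s.trim(); }
function lower(s) { return s.toLowerCase(); }
function dashify(s) { return s.replace(/\s+/g, '-'); }

// Applied right to left: trim, then lower, then dashify.
const toSlug = compose(dashify, lower, trim);
```

Because `increment` does not depend on `this`, it can be handed to a callback or an event listener without `bind`.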

“OO leads to anger; Anger leads to hate; Hate leads

to suffering!” Embrace the Difference

Know Your Platform

Node.JS Is Perfect For… • IO-Heavy Applications • Data-Intensive Realtime Apps • RESTful / API-Driven (Micro)services • Streams • Queued (Lazy) Writes • Processing data on-the-fly https://github.com/libuv/libuv

Node.JS Is not For… • Serving Static Files • CPU-bound

Applications • Creating a Monolithic Infrastructure

Node.JS is not a Swiss Army Knife • Load Balancing

Know Your Bottlenecks • Node.JS serves really well as a

highly concurrent networking app. • Node.JS is very sensitive to memory leaks and blocking code. • 99% of the time you will be IO-bound.

Know the Ecosystem • Do Not Ignore The Ecosystem • Follow Community News and Updates • Attend Conferences (like this one) • Know Your Tools and Use Them

Tweaking the OS

Open File Limits "Error: EMFILE, Too many open files" ulimit

-n 60000

Configuring the Load Balancer * See https://bit.ly/nginx-rocks

Additional Tweaks sysctl -w net.core.somaxconn=1024; (default is 128)

Even More Tweaks Do NOT alter anything that you don’t

know! * See   https://www.frozentux.net/ipsysctl-tutorial/ipsysctl-tutorial.html and http://www.tldp.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3/   for more info.

Security

Common Threats • XSS / CSRF • Input Validation Attack

• DoS / ReDoS • Request Size * Securing Node.JS is not different from securing any other web app. See also: https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project

Security • https://www.owasp.org/index.php/ OWASP_Node_js_Goat_Project • http://nodesecurity.io/ • https://www.owasp.org/index.php/ OWASP_Zed_Attack_Proxy_Project

Do Not Run Node.JS As Root
useradd -mrU web
mkdir /opt/web-app
chown web /opt/web-app
cd /opt/web-app
su web
node app.js
firewall-cmd --permanent --zone=public --add-port=3000/tcp
Also, always run Node.JS behind a reverse proxy!
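As a complement to the shell commands, the app itself can refuse to start with root privileges. This is a small sketch, not something shown in the talk; `isRoot` is a hypothetical helper, and it takes the process object as a parameter so it can be exercised without actually being root.

```javascript
// Returns true when the given process object reports uid 0.
// process.getuid does not exist on Windows, hence the typeof check.
function isRoot(proc) {
  return typeof proc.getuid === 'function' && proc.getuid() === 0;
}

// Throws early instead of serving traffic with root privileges.
function assertNotRoot(proc) {
  if (isRoot(proc)) {
    throw new Error('Refusing to run as root; start the app as an unprivileged user.');
  }
}
```

Calling `assertNotRoot(process)` as the first statement of `app.js` would make a misconfigured deployment fail fast.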

Let’s Get Our Hands Dirty

restify http tcp containers/000-simple-app-restify containers/001-simple-app-http containers/002-simple-app-tcp

Going Bare Bones ab -n 10000 -c 100 http://app:8000/hello Tested on MacBook Pro, 2.4 GHz Intel Core i5, 16 GB 1600 MHz DDR3

Is It Worth It? • You Can Go Bare-Bones for

Maximum Throughput • Tradeoff: • Harder to maintain • More complex code • Error prone • Lots of edge cases • Harder to use additional tooling

Concurrency

Throughput vs Concurrency (chart; phases as concurrency grows: linear increase, slowdown, almost constant, rapid decline)

Distributed Load Testing Toolbox • Apps • jMeter: http://jmeter.apache.org • Gatling: http://gatling.io/#/ • The Grinder: http://grinder.sourceforge.net • Locust: https://github.com/locustio/locust • “as a service” • flood.io: https://flood.io • loader.io: http://loader.io • LoadImpact: https://loadimpact.com • BlazeMeter: https://www.blazemeter.com • LoadStorm: http://loadstorm.com

Latency

Latency and Throughput

Lessons Learned

Lessons Learned • Latency Kills • Know Your Platform & Know Your Tools • For maximum throughput go bare bones • Tradeoff: Giving up all the benefits a framework has to offer • Low-level code is harder to maintain: • Harder to Test and Verify / Easier to Create Bugs and Regressions • Corollary: As you add additional layers of abstractions, your API will marginally slow down. • The Inception Rule: More than three levels and you’re lost forever!

Perf Before Scale

Perf Advice Is Addictive Optimization without measurement is futile.

Perf Before Scale • Rule #1:   Avoid premature optimization. 

Do measurements, and optimize what matters. • Tweak Your System for High Performance • Cache All The Things • Cache at every level. • The fastest API response is no response at all. • Delegate Long-Running/CPU-Intensive(*) Operations • Be Lazy Whenever Possible

You can’t optimize what you don’t measure.

Optimization

Things to Watch Out For • Always Keep an Eye

on the Event Loop • Your API Service may Become CPU-Bound • External API Calls Can Be a Bottleneck • Track Heap Usage Over Time • Implement Sanity Checks • Implement Circuit Breakers • Have an Upper Bound for Concurrency
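One way to "have an upper bound for concurrency" is to count in-flight requests and shed load once a limit is reached. The sketch below is an assumption about how that could be wired into a plain `http` handler; `createConcurrencyGate` and `guard` are hypothetical names, and the limit itself should come from measurements.

```javascript
// Counts requests currently being processed and refuses new ones above a limit.
function createConcurrencyGate(maxInFlight) {
  let inFlight = 0;
  return {
    tryEnter: function tryEnter() {
      if (inFlight >= maxInFlight) { return false; }
      inFlight += 1;
      return true;
    },
    leave: function leave() {
      if (inFlight > 0) { inFlight -= 1; }
    },
    inFlight: function current() { return inFlight; }
  };
}

// Wraps a (req, res) handler; answers 503 when the gate is full.
function guard(gate, handler) {
  return function guarded(req, res) {
    if (!gate.tryEnter()) {
      res.statusCode = 503;
      res.setHeader('Retry-After', '1');
      res.end('Server Busy');
      return;
    }
    // Release the slot exactly once, whether the response finished
    // normally or the connection was closed early.
    let released = false;
    function release() {
      if (!released) { released = true; gate.leave(); }
    }
    res.on('finish', release);
    res.on('close', release);
    handler(req, res);
  };
}
```

With Node's built-in server this would be used as `http.createServer(guard(createConcurrencyGate(100), handler))`, where `100` is only a placeholder value.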

Things to Watch Out For • Is the app running and functional? • Is the app overloaded? • How many errors have been raised so far? • Is the app performant (throughput, memory utilization, concurrency)? • Is my cluster healthy? • How many times have forks been restarted? • Are all clustered forks alive and okay?

Which Will (most of the time) Boil Down to… •

Watching Response Times • Watching CPU Utilization + General Sys Resource Usage • Watching Number of Concurrent Connections

v8 Optimizations

Types of Compilers in v8 • Generic Compiler • Optimizing

Compiler (Crankshaft) • Can Be Two or More Orders of Magnitude Faster See also: * https://wingolog.org/archives/2011/07/05/v8-a-tale-of-two-compilers * http://thibaultlaurens.github.io/javascript/2013/04/29/how-the-v8-engine-works/ * http://www.html5rocks.com/en/tutorials/speed/v8/

X-Ray View Into the v8 Compiler node --trace_opt   --trace_deopt

  --allow-natives-syntax test.js; • console.log(%HasFastProperties(obj)) • console.log(%GetOptimizationStatus(fn)) https://github.com/Nathanaela/v8-natives

Optimize Hot Code Paths Only (Unless You Have Solid Evidence to Do Otherwise)

v8 Optimization Killers • Using debugger anywhere within the function.

• Using eval anywhere within the function. • Using with anywhere within the function. • Using try/catch anywhere within the function. * ~via https://github.com/petkaantonov/bluebird/wiki/Optimization-killers

Typical Example: try/catch Inside Function

Isolate try/catch
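The code on these two slides did not survive in the transcript, so the following is a reconstruction of the general idea rather than the talk's own example. On the Crankshaft-era v8 the talk refers to, a function containing try/catch was not optimized; the usual workaround was to move the try/catch into a tiny helper so the hot loop stays free of it. `tryParseJson` and `sumValues` are hypothetical names.

```javascript
// The only function that contains try/catch. It is small on purpose:
// if the engine declines to optimize it, little is lost.
function tryParseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err };
  }
}

// Hot path: no try/catch in this function body.
function sumValues(lines) {
  let total = 0;
  for (let i = 0; i < lines.length; i += 1) {
    const parsed = tryParseJson(lines[i]);
    const value = parsed.ok ? parsed.value : null;
    if (value !== null && typeof value === 'object' && typeof value.n === 'number') {
      total += value.n;
    }
  }
  return total;
}
```

Whether this still pays off depends on the v8 version in use, which is one more reason to measure before and after.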

Perform Lazy Evaluations (code slide; annotations mark two Async steps and one Sync step)
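The slide's code is not part of the transcript. As a hedged illustration of lazy evaluation in general, the hypothetical `lazy` helper below defers an expensive computation until its result is first requested and then reuses it.

```javascript
// Wraps a computation so it runs at most once, on first access.
// If `compute` returns a Promise, every caller shares that same Promise,
// so the helper covers asynchronous work as well.
function lazy(compute) {
  let done = false;
  let value;
  return function get() {
    if (!done) {
      value = compute();
      done = true;
    }
    return value;
  };
}
```

A typical use is `const getConfig = lazy(readAndParseConfig)`, where `readAndParseConfig` stands for any costly function: requests that never need the config never pay for it.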

Let’s Create Something Real

Let’s Create Something Real • An API that… • Auto-suggest

tags, given a url • Lists related URLs, given a tag

containers/003-the-real-deal

Initial Topology API Service Internet Bastion test API Simulated by an NGINX static web server * Fetch HTML off of websites * Simplify and convert the HTML to plain text * Do NLP/Tokenization on the plain text * Create tags as a result

Let’s Test How Our API Performs

Findings • get-tags appears to be CPU-bound. • While get-tags is being requested, get-urls becomes two orders of magnitude slower. • On its own, get-urls appears to be pretty fast, and it is not CPU-bound.

How Can We Be Sure? • Add probes (DTrace, XTrace…

etc)   to trace what’s happening. • Create a REPL to check the app at runtime. containers/004-demo-w-instrumentation

Creating a REPL • You Can Expose Internal State via

an API and/or a CLI/REPL • vantage: https://github.com/dthree/vantage • kang: https://github.com/davepacheco/kang • repl server: https://nodejs.org/api/repl.html • Expose Additional Logging Info at Runtime (in systems that support it) • bunyan -p ( https://github.com/trentm/node-bunyan )

The REPL

Adding Probes

Adding Probes * See https://github.com/v0lkan/kiraz App Node

Adding Probes Bastion Host

Findings (get-tags)

Monitoring Toolbox • Runtime Performance Probing (Kernel-Level Tools) • Linux Perf Events ( https://perf.wiki.kernel.org/index.php/Main_Page ) • perf record -F 71 -p `pgrep -n node` -g -- sleep 30 • node --perf_basic_prof_only_functions • DTrace ( http://dtrace.org/blogs/about/ ) • Tracking Transactions and Tracing Latency • Zipkin ( https://github.com/openzipkin/zipkin ) • Runtime Memory Usage (heap stats, heap diffing, leak detection) • Memwatch ( https://github.com/lloyd/node-memwatch ) • See http://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection for details.

Monitoring Toolbox • Monitoring “as a service” • nodetime https://nodetime.com/

• newrelic http://newrelic.com/nodejs • strongloop https://strongloop.com/node-js/performance-monitoring/ • keymetrics https://keymetrics.io/ • appdynamics https://www.appdynamics.com/nodejs/ • …

So… Something Is CPU-Intensive • get-tags is CPU-bound and it also blocks the event loop • What can we do? • Split off the computationally heavy parts, fork them as child processes, and use external libraries. • Create a native Node.JS extension that does not block the event loop. • Refactor the compute logic into a separate service first.
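A rough sketch of the first option, moving CPU-heavy work into a child process so the parent's event loop stays free. A naive Fibonacci stands in for the real computation, and the worker is passed inline via `node -e` only to keep the example in one file; in a real service it would live in its own module. `fibInChild` and `WORKER_SOURCE` are hypothetical names.

```javascript
const execFile = require('child_process').execFile;

// Source of the worker, executed in a separate Node process.
// It reads its input from the last command-line argument.
const WORKER_SOURCE = [
  'function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }',
  'var n = Number(process.argv[process.argv.length - 1]);',
  'process.stdout.write(String(fib(n)));'
].join('\n');

// Runs the computation in a child process and reports the result
// through a Node-style callback. The parent keeps serving requests
// while the child is busy.
function fibInChild(n, callback) {
  const args = ['-e', WORKER_SOURCE, String(n)];
  execFile(process.execPath, args, function onChildDone(err, stdout) {
    if (err) { return callback(err); }
    callback(null, Number(stdout));
  });
}
```

Spawning a process per request is expensive; a small pool of long-lived workers, or the separate compute service discussed next, amortizes that cost.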

Split App and Compute Nodes Compute Service API Service Message

Bus rabbitmq, zeromq, resque etc. see also http://queues.io/ * * containers/005-demo-split-compute

Message Bus Topologies P C Send/Listen P C1 C2 Worker

Queue X C1 PubSub P C2

Split App and Compute Nodes (Message Bus) RabbitMQ Request Queue

Response Queue API Service Compute Service API Service API Service … … Compute Service one response queue per service round-robin dispatch Compute Service Response Queue Response Queue
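The topology above (one shared request queue, round-robin dispatch to compute services, one response queue per API service) can be modelled in memory to make the message flow concrete. This is a toy sketch and deliberately does not use the RabbitMQ client API; `createBroker` and its methods are hypothetical, and delivery here is synchronous, which a real bus is not.

```javascript
// In-memory stand-in for the message bus.
function createBroker() {
  const consumers = [];      // compute services sharing the request queue
  const responseQueues = {}; // one response callback per API service
  let nextConsumer = 0;
  let correlationSeq = 0;

  return {
    // A compute service subscribes to the shared request queue.
    addConsumer: function addConsumer(handler) {
      consumers.push(handler);
    },
    // An API service registers its own response queue.
    registerService: function registerService(serviceId, onResponse) {
      responseQueues[serviceId] = onResponse;
    },
    // Publishes a request; the reply lands on the caller's response queue,
    // tagged with a correlation id so it can be matched to the request.
    request: function request(serviceId, payload) {
      if (consumers.length === 0) {
        throw new Error('no compute service available');
      }
      correlationSeq += 1;
      const correlationId = correlationSeq;
      const consumer = consumers[nextConsumer];
      nextConsumer = (nextConsumer + 1) % consumers.length; // round-robin
      const result = consumer(payload);
      responseQueues[serviceId]({ correlationId: correlationId, result: result });
      return correlationId;
    }
  };
}
```

The correlation id is what lets an API service hold many requests in flight and still pair each reply with the HTTP response that is waiting for it.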

Log Aggregation “Good developers debug. Great developers read logs.”

Aggregate and Rotate Your Log Files Log Aggregator Compute Service

memory API Service memory Message Bus containers/006-demo-eventbus-logaggr

Use a Decent Logger • Bunyan ( https://github.com/trentm/node-bunyan ) •

Winston ( https://github.com/winstonjs/winston ) • Log4JS ( https://github.com/nomiddlename/log4js-node )

What to Log • Authentication & Authorization • Session Management • Method Entry Points • Errors and Weird Events • Specific Events (startup, shutdown, slowdown etc.) • High-Risk Functionalities (payments, privileges, admins etc)
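Log aggregators work best with one structured record per line. The sketch below shows the shape of such a logger without depending on any of the libraries listed above; it is not the Bunyan or Winston API, and `createLogger` plus the field names are assumptions for illustration.

```javascript
// Emits one JSON object per line. `write` is injectable so the output can
// go to stdout, a file stream, or a test buffer.
function createLogger(name, write) {
  const sink = write || function toStdout(line) {
    process.stdout.write(line + '\n');
  };

  function log(level, event, fields) {
    // Caller-supplied fields first, so they cannot overwrite the core fields.
    const record = Object.assign({}, fields, {
      time: new Date().toISOString(),
      name: name,
      level: level,
      event: event
    });
    sink(JSON.stringify(record));
  }

  return {
    info: function info(event, fields) { log('info', event, fields); },
    warn: function warn(event, fields) { log('warn', event, fields); },
    error: function error(event, fields) { log('error', event, fields); }
  };
}
```

A call such as `log.warn('auth.failed', { userId: 7 })` then yields a line that can be filtered by `event` or `level` in the aggregator.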

Log Analysis Toolbox • Loggly ( https://www.loggly.com/ ) • ELK

Stack ( https://www.elastic.co/products ) • Nagios Log Server ( https://www.nagios.com/products/nagios-log-server/ ) • Splunk ( http://www.splunk.com/en_us/homepage.html ) • …

Utilize Caching Compute Service API Service in-memory cache Message Bus

containers/006-demo-eventbus-logaggr
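A minimal in-memory cache with a time-to-live, as one possible reading of the "in-memory cache" box in the diagram. `createTtlCache` is a hypothetical helper and not the code from the demo container; the clock is injectable so expiry can be tested without waiting.

```javascript
// Caches values for `ttlMs` milliseconds. Expired entries are dropped
// lazily, on the next read of the same key.
function createTtlCache(ttlMs, now) {
  const clock = now || Date.now;
  const entries = new Map();

  return {
    get: function get(key) {
      const entry = entries.get(key);
      if (!entry) { return undefined; }
      if (clock() >= entry.expiresAt) {
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    set: function set(key, value) {
      entries.set(key, { value: value, expiresAt: clock() + ttlMs });
    },
    size: function size() { return entries.size; }
  };
}
```

This sketch has no size limit; a long-running service would also need an eviction policy so the cache cannot grow until it exhausts the heap.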

What If My App Crashes?

Processes Die: Accept It • No system is 100% resilient. • Every crash is important. • Every Exception is Important Too: Adopt a “Zero Exception Policy” containers/007-demo-nodejs-as-a-service

Keep It Running • forever ( https://github.com/foreverjs/forever ) • pm2 ( https://github.com/Unitech/pm2 ) • upstart ( http://upstart.ubuntu.com/ ) • systemd ( https://www.wikiwand.com/en/Systemd )

Processes Die Accept it https://www.joyent.com/blog/mdb-and-node-js

Debugging

Live Debugging Given the Tornado, Where’s the Butterfly?

Post-Mortem Debugging

Node.JS Debugging Myths • Debugging and Profiling in Node.JS is Hard • Debugging and Profiling in Node.JS is Immature • You Cannot Debug or Profile a Live Production Node.JS App

Debugging • Live Debugging (using a REPL) • Remote Debugging

  (Node Inspector https://github.com/node-inspector/node-inspector,   WebStorm https://www.jetbrains.com/webstorm/,   Cloud9 IDE https://c9.io/) • Post-Mortem Debugging   (MDB: https://github.com/joyent/mdb_v8)

Flame Graphs http://www.brendangregg.com/flamegraphs.html http://github.com/brendangregg/FlameGraph

Flame Graphs & Core Dumps • Core Dumps • Can Be Created When Node.JS Crashes ( --abort_on_uncaught_exception ) • Can Be Created at Runtime ( using gcore * ) • Flame Graphs • You Can Use dtrace + stackvis to generate them ** • You Can Use perf events + Flame Graphs Tool to generate them *** * http://man7.org/linux/man-pages/man1/gcore.1.html ** http://blog.nodejs.org/2012/04/25/profiling-node-js/ *** http://yunong.io/2015/11/23/generating-node-js-flame-graphs/

Debugging (Profiling) • Use Kernel Level Tools • DTrace (Solaris, BSD), perf (Linux), and XPerf (Windows) • Can be used in production • Use the v8 Profiler • Not quite suitable for production

v8 Profiler • node --v8-options | grep gc • node --v8-options | grep '\-\-trace' • `node --perf_basic_prof_only_functions .` => for perf events (new in Node 5) • `node --expose_gc --trace_gc --trace_gc_object_stats --trace_gc_verbose --gc_global .` => traces to the console • `node --prof --log_timer_events --track_gc_object_stats --log_internal_timer_events --no-use-inlining .` => creates a perf log file * See also: http://www.chromium.org/developers/creating-v8-profiling-timeline-plots

Debugging Demo containers/008-demo-watching-for-leaks

Help the Debugger • Always Name Your Functions • Don’t

let the errors go unhandled. • Emit “error” events instead of throwing exceptions. • Use an error library: • https://github.com/davepacheco/node-verror • Put a descriptive message before raising an error.
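The advice above can be illustrated with a short sketch: every function is named (so stack traces and flame graphs stay readable), failures are emitted as "error" events rather than thrown from a callback, and the message says what was being attempted. `createTagFetcher` and the injected `fetchHtml(url, callback)` are hypothetical; splitting on whitespace is only a placeholder for real tag extraction.

```javascript
const EventEmitter = require('events').EventEmitter;

// `fetchHtml(url, callback)` is injected so the sketch has no network dependency.
function createTagFetcher(fetchHtml) {
  const emitter = new EventEmitter();

  emitter.fetch = function fetchTags(url) {
    fetchHtml(url, function onHtmlFetched(err, html) {
      if (err) {
        // Descriptive message first; the original error is kept for debugging.
        const wrapped = new Error(
          'fetchTags: failed to fetch "' + url + '": ' + err.message
        );
        wrapped.cause = err;
        emitter.emit('error', wrapped);
        return;
      }
      emitter.emit('tags', html.split(/\s+/).filter(Boolean));
    });
  };

  return emitter;
}
```

Note that an EventEmitter throws if "error" is emitted with no listener attached, so callers must subscribe to "error" before calling `fetch`.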

Use a Private NPM Log Aggregator Compute Service memory API

Service memory Message Bus Private NPM Public NPM cache / mirror containers/009-demo-setting-up-private-npm (sinopia) https://github.com/rlidwka/sinopia

Use a Private NPM ../../../../../../wtf ?!

Use a Private NPM (prefix local modules)

Use a Private NPM • Promotes modularization and code re-use. • Modules are cached, hence faster to install. • You can continue your work, even when public registry goes offline. • Makes refactoring and testing easier. • No more “../../../..”s!

Clustering :(){  :|:&  };:

Clustering containers/010-cluster * See http://docs.libuv.org/en/v1.x/threadpool.html https://nikhilm.github.io/uvbook/processes.html https://nikhilm.github.io/uvbook/threads.html for how the

dark magic works internally. * See also https://strongloop.com/strongblog/whats-new-in-node-js-v0-12-cluster-round-robin-load-balancing/ for how the load balancing between processes in the cluster module evolved over time;  and see https://github.com/nodejs/node-v0.x-archive/commit/e72cd41   for the Round-Robin cluster load balancing algorithm.

Clustering app memory app app app * See https://strongloop.com/strongblog/whats-new-in-node-js-v0-12-cluster-round-robin-load-balancing/ how

the load balancing between processes in the cluster module evolved over time;  and see https://github.com/nodejs/node-v0.x-archive/commit/e72cd41   for the Round-Robin cluster load balancing algorithm.
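A hedged sketch of the cluster setup, not the code from `containers/010-cluster`. `pickWorkerCount` and `startCluster` are hypothetical helpers; `cluster.isMaster` is the property name from the Node versions of the talk's era (newer releases also expose it as `cluster.isPrimary`).

```javascript
const cluster = require('cluster');
const os = require('os');

// Never fork more workers than there are cores, and never fewer than one.
function pickWorkerCount(cpuCount, maxWorkers) {
  return Math.max(1, Math.min(cpuCount, maxWorkers));
}

// In the master: fork the workers and replace any that die.
// In a worker: run the application code passed in as `workerMain`.
function startCluster(workerMain, maxWorkers) {
  if (cluster.isMaster) {
    const count = pickWorkerCount(os.cpus().length, maxWorkers);
    for (let i = 0; i < count; i += 1) {
      cluster.fork();
    }
    cluster.on('exit', function onWorkerExit(worker, code, signal) {
      console.error('worker %d exited (%s); forking a replacement',
        worker.process.pid, signal || code);
      cluster.fork();
    });
  } else {
    workerMain();
  }
}
```

An entry point would call something like `startCluster(function () { require('./server'); }, 4)`, where `./server` is a placeholder for the module that starts listening; the workers then share the listening socket.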

Is Bigger Always Better? ultra mega super box with bazillion

cores regular box

How Many Workers Per VM? two to four cores per

VM is an ideal balance m a s t e r child_process child_process child_process child_process

Is Bigger Always Better? OR you can use lightweight single-CPU

containers and a LB in lieu of clustering lightweight container lightweight container Load Balancer lightweight container lightweight container <-single core <-single core <-single core <-single core

Cluster The Services VM 2 Compute Service Compute Service cluster

API Service API Service cluster VM 1 Message Bus

Cluster

Cluster + Zero Downtime Rolling Deployments

Zero Downtime Rolling Deployments kill -USR2 <pid>

Zero Downtime Rolling Deployments
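The deployment code itself is not in the transcript, so this is only a sketch of the idea: when the master process receives SIGUSR2, it replaces its workers one at a time so that some workers are always serving. `rollingRestart` and `onReloadSignal` are hypothetical, as is the `restartOne(worker, done)` callback, which is expected to start a replacement, wait until it is listening, and then stop the old worker.

```javascript
// Restarts workers strictly one after another; stops at the first failure
// so a bad release does not take every worker down.
function rollingRestart(workers, restartOne, done) {
  const queue = workers.slice();
  function next(err) {
    if (err || queue.length === 0) {
      return done(err || null);
    }
    restartOne(queue.shift(), next);
  }
  next();
}

// Wires the rolling restart to SIGUSR2 (not available on Windows).
function onReloadSignal(getWorkers, restartOne) {
  process.on('SIGUSR2', function onSigusr2() {
    rollingRestart(getWorkers(), restartOne, function finished(err) {
      if (err) {
        console.error('rolling restart aborted: ' + err.message);
      }
    });
  });
}
```

Restarting sequentially trades deployment speed for capacity: with N workers, at least N minus one are available at any moment.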

Circuit Breaker (state diagram) • closed (200: OK): fail (under threshold) → closed; fail (reached threshold) → open • open (503: Server Busy): timer (exponential backoff) → checking… • checking…: success → closed; fail → open See http://www.amazon.com/gp/product/0978739213 and http://martinfowler.com/bliki/CircuitBreaker.html

Circuit Breaker local-modules/local-fluent-circut * This is a simplified example, and it does not strictly follow circuit-breaker state transitions. See: https://github.com/yammer/circuit-breaker-js and https://github.com/mweagle/circuit-breaker for more canonical implementations. * You can use https://github.com/lloyd/node-toobusy for checking event loop delay.

Circuit Breaker • Can be used with any kind of metric. • You can use it to “rate limit” your API. • Useful when you depend on other APIs that might fail.
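A compact sketch of the closed / open / checking state machine with exponential backoff. It is not the talk's `local-fluent-circut` module; `createCircuitBreaker` and its option names are assumptions, and while "checking" it lets every request through instead of a single probe, which a production breaker would tighten.

```javascript
// options.threshold:   consecutive failures that open the circuit
// options.baseDelayMs: first waiting period; doubles on every repeated trip
// options.now:         injectable clock (defaults to Date.now)
function createCircuitBreaker(options) {
  const threshold = options.threshold;
  const baseDelayMs = options.baseDelayMs;
  const now = options.now || Date.now;

  let state = 'closed';
  let failures = 0;
  let trips = 0;   // consecutive trips, drives the backoff
  let retryAt = 0;

  function trip() {
    state = 'open';
    retryAt = now() + baseDelayMs * Math.pow(2, trips);
    trips += 1;
  }

  return {
    // Should the caller attempt the protected operation right now?
    allow: function allow() {
      if (state === 'open' && now() >= retryAt) {
        state = 'checking';
      }
      return state !== 'open';
    },
    success: function success() {
      state = 'closed';
      failures = 0;
      trips = 0;
    },
    failure: function failure() {
      if (state === 'checking') { trip(); return; }
      failures += 1;
      if (failures >= threshold) { trip(); }
    },
    state: function current() { return state; }
  };
}
```

In an API handler, a `false` from `allow()` would translate into an immediate 503 response instead of a call to the struggling dependency.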

Where Were We? VM 2 Compute Service Compute Service cluster

API Service API Service cluster VM 1 Message Bus

VM 2 Compute Service Compute Service cluster API Service API

Service cluster VM 1 Message Bus memory memory Are We Missing Something?

Move the State Information Out VM 2 Compute Service Compute Service cluster redis API Service API Service redis cluster VM 1 Message Bus containers/011-sharing-memory • Use redis to solve session affinity. • Use token-based authentication with JWT to handle authentication ( https://scotch.io/tutorials/the-ins-and-outs-of-token-based-authentication ).

Add a Load-Balancer Compute Service Compute Service cluster redis API

Service API Service redis cluster Compute Service Compute Service cluster API Service API Service cluster Load Balancer Message Bus containers/012-bounce

Add AutoScale Rules autoscale groups Compute Service Compute Service cluster

redis API Service API Service redis cluster Compute Service Compute Service cluster API Service API Service cluster Load Balancer Message Bus

Load Balancing Options • Load Balancing as a Service (AWS,

Rackspace…) • Hardware Load Balancer (Cisco CEF, Barracuda, etc…) • Software Load Balancer • NGINX • HAProxy • home grown

Load Balancer

Wait! Aren’t These Actually Microservices? API app compute app worker

worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process API app compute app worker worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process broker load balancer Internet message bus redis redis … …

API app compute app worker worker worker child_process API app

c l u s t e r c l u s t e r compute app worker worker worker child_process API app compute app worker worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process broker load balancer Internet message bus redis redis … … API μ-Service Wait! Aren’t These Actually Microservices?

API app compute app worker worker worker child_process API app

c l u s t e r c l u s t e r compute app worker worker worker child_process API app compute app worker worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process broker load balancer Internet message bus redis redis … … API μ-Service Compute μ-Service Wait! Aren’t These Actually Microservices? * See Also: http://martinfowler.com/articles/microservice-trade-offs.html http://highscalability.com/blog/2014/4/8/microservices-not-a-free-lunch.html https://rclayton.silvrback.com/failing-at-microservices

That Means You’ve Become Famous Scalability Will Be the Least

of Your Concerns What If I Reach The Scalability Limits Within a Region?

Multiple Regions Region 1 Compute Service Compute Service redis API

Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Round-Robin DNS The Internet Message Bus Message Bus containers/013-round-robin

Round-Robin DNS containers/013-round-robin

Multiple Regions Load Balancer Load Balancer Region 1 Compute Service

Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB replication replication Round-Robin DNS The Internet Message Bus Message Bus

You Can Add More Region 1 Compute AutoScale Group API

AutoScale Group LB LB DNS The Internet Message Bus Region 2 Compute AutoScale Group API AutoScale Group Message Bus … Region N Compute AutoScale Group API AutoScale Group Message Bus LB

How Do I Manage All This Infrastructure? This is Getting

Out of Hand! Region 1 Compute AutoScale Group API AutoScale Group LB LB DNS The Internet Message Bus Region 2 Compute AutoScale Group API AutoScale Group Message Bus … Region N Compute AutoScale Group API AutoScale Group Message Bus LB

Configuration Management

Configuration Management Tips • No Hard-Coded IP Addresses in Config Files • Let DNS do What it Does Best • Use Environment Variables for Infrastructure Management • Converge Your Infrastructure Using a Central Service • Salt Cloud ( https://docs.saltstack.com/en/develop/topics/cloud/index.html ) • AWS CloudFormation ( https://aws.amazon.com/cloudformation/ ) • Chef ( https://www.chef.io/ ) • Puppet ( https://puppetlabs.com/ ) • Ansible ( http://www.ansible.com/ ) • Service Discovery • Consul ( https://www.consul.io/ )
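A small sketch of reading infrastructure settings from environment variables and failing fast when one is missing. The variable names `PORT`, `REDIS_HOST`, and `MESSAGE_BUS_URL` are invented for illustration, and the hosts are expected to be DNS names rather than IP addresses, in line with the tips above.

```javascript
// Builds the configuration from an environment object (normally process.env).
// Taking `env` as a parameter keeps the function easy to test.
function loadConfig(env) {
  function required(name) {
    const value = env[name];
    if (value === undefined || value === '') {
      throw new Error('Missing required environment variable: ' + name);
    }
    return value;
  }

  const port = parseInt(env.PORT || '8000', 10);
  if (isNaN(port)) {
    throw new Error('PORT must be a number, got: ' + env.PORT);
  }

  return {
    port: port,
    redisHost: required('REDIS_HOST'),           // a DNS name, not an IP
    messageBusUrl: required('MESSAGE_BUS_URL')
  };
}
```

Calling `loadConfig(process.env)` once at startup means a misconfigured instance crashes immediately instead of failing on its first request.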

Test Your System as a Whole

CI / CD • Use a CI / CD Pipeline

• Show Love to Test-Driven Development • Don’t Forget Functional Tests and Integration Tests

Continuously Keep Your Code In Ship Shape • ESLint (

http://eslint.org ) • CodeClimate ( https://codeclimate.com/features ) • GreenKeeper ( http://greenkeeper.io ) • npm scripts (instead of Grunt or Gulp — YMMV)  ( https://docs.npmjs.com/misc/scripts ) • npm outdated ( https://docs.npmjs.com/cli/outdated ) • git pre-commit hooks ( https://github.com/observing/pre-commit ) • [ hint: Install your development dependencies (such as eslint, babel, gulp, etc) locally, (not globally)! ]

Are We Done Yet? Load Balancer Load Balancer Region 1

Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB replication replication Round-Robin DNS The Internet Message Bus Message Bus

Making the Load Balancer HA * see also: https://www.wikiwand.com/en/Virtual_Router_Redundancy_Protocol Load

Balancer client Load Balancer Load Balancer client keepalived active failover

Making the Load Balancer Highly Available • round-robin DNS •

https://www.wikiwand.com/en/Round-robin_DNS • heartbeat • https://www.wikiwand.com/en/Heartbeat_(computing) • keepalived • http://keepalived.org/ * You can use these tools to make any component HA.

SSL Termination * * * https://github.com/bumptech/stud Load Balancer Load Balancer

client keepalived active failover SSL Terminator SSL Terminator client keepalived active failover Load Balancer Load Balancer

Make Redis and RabbitMQ Redundant redis redis (master) redis (read replica) redis (read replica) redis (read replica) redis (master) redis (read replica) redis (read replica) redis (read replica) round-robin DNS This will also increase throughput as a side benefit. See http://redis.io/topics/replication and http://redis.io/topics/cluster-tutorial. You can also use a managed “memory as a service” solution. See also https://www.rabbitmq.com/ha.html for how a similar queue mirroring is implemented for a RabbitMQ cluster. And similarly, you can use a managed “queue as a service” solution to ease your pain ;)

Build Redundancy Everywhere

Build Redundancy Everywhere Note: This is more typically done by using the sidekick health checks of your service discovery tool. See, for example, https://www.consul.io/intro/getting-started/checks.html for details.

Torture Your System • Try Chaos Monkey • https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey • Randomly send `kill -9` to Processes • Randomly Knock a Server Offline • Intentionally Run Out of Disk Space • Take an entire data center down

Summary Σ

Summary • Stateless is Better than Stateful • Eventual Consistency • Build Redundancy Everywhere! • Start Up Fast, Shut Down Gracefully • Solve Problems That Actually Exist

Summary • Never Assume, Always Measure • Perf Before Scale • Infrastructure is Code; Automate It! • Keep Configuration Details in Environment Variables • Show Love to DNS

Summary • Know Your Ecosystem • Know Your Tools •

Use Tools, not Rules!

Scale 2 ∞ & 㱺
