Scaling Your Node.JS API Like a Boss

volkan
March 08, 2016

________

Video of the Presentation Is Available Here »»
https://www.youtube.com/watch?v=Ogjb60Fg10A
________

It’s one thing to create a sample RESTful API using Node.js (maybe utilizing the cluster module to distribute the load), but it’s quite another to horizontally scale your architecture to hundreds of thousands of concurrent connections while trying to ensure redundancy and high availability. Knowing how to scale is important, but more important than that is knowing when to scale.

Volkan Özçelik explores what it takes to create a real-life, scalable, highly available, and highly responsive Node.js application. Volkan will also explain how to store the application state in its own cluster and why it matters.

Volkan outlines how to choose the container architecture for your (virtual) machines, how to roll out updates to your service without disrupting users, and how to fail gracefully when things on a node go haywire. He also covers tracking down memory leaks, with both short-term fixes (i.e., restarting your nodes when they become too beefy) and long-term fixes (i.e., actually spotting where the leaks are and fixing them).

When you dive deeper and deeper into the rabbit hole, you soon realize that scalability is a tough job that requires careful planning and consideration. The bottom line is that designing any system to scale is a never-ending adventure, and there is no limit on how deep you can dive.

Transcript

  1. Scaling your Node.JS API …like a boss • Volkan Özçelik • March 7, 2016 • http://bit.ly/nodejs-rocks • http://volkan.io/ • @linkibol • v0lkan
  2. API Scalability (in theory)
  3. API Scalability (in practice): nothing is linear; $#!% will eventually happen!
  4. About Me • Volkan Özçelik: JavaScript Lover & Performance Freak • Current: Technical Lead @ Cisco • Before: Mobile Frontend Engineer @ Jive Software; VP of Technology @ grou.ps (now GymGroups); CTO @ cember.net (acquired by Xing) • Chase Me: @linkibol, v0lkan
  5. volkan.io: Slides & Source Code
  6. Agenda • Node’s Strengths and Weaknesses • Tweaking Our OS • Throughput, Concurrency, Latency • Scale a Real-Life Node App
  7. How do I Architect a Scalable and Consistent Node.JS API with Manageable Complexity? In a Nutshell…
  10. Don’t Fight Windmills • Keep things simpler. • Build something that’s good enough for your purpose. • Solve for the problems that are actually on your plate.
  11. Don’t Invent Problems That You Don’t Have (Yet) • Monitor All The Things • Collect Metrics • Form a Hypothesis • Gather Evidence • Validate Your Hypothesis • Take Corrective Action If Needed
  12. Goals • Minimize Client Response Time • Maximize Resource Efficiency on the Server • Hint: Leave 50% of the memory unused (for taking core dumps)
  13. High-Level Topology of an API Service [diagram labels: Clients; HTTP Proxy; Load Balancer (SSL Termination, Load Balancing); API Gateway (Authentication, Authorization, Token Exchange, Rate Limiting, …); API Service]
  15. So… JavaScript?
  16. Show Love to Functions • Accept JavaScript’s functional and composable nature. • Avoid `this` and avoid `new`. You’ll thank me later. • Create Focused, Independent, Reusable, and Testable Modules.
  17. “OO leads to anger; Anger leads to hate; Hate leads to suffering!” Embrace the Difference
  18. Know Your Platform
  19. Node.JS Is Perfect For… • IO-Heavy Applications • Data-Intensive Realtime Apps • RESTful / API-Driven (Micro)services • Streams • Queued (Lazy) Writes • Processing data on-the-fly • See https://github.com/libuv/libuv
  20. Node.JS Is Not For… • Serving Static Files • CPU-bound Applications • Creating a Monolithic Infrastructure
  21. Node.JS Is Not a Swiss Army Knife • Load Balancing ➡ haproxy | NGINX | ELB ( http://www.haproxy.org/ | http://nginx.org/ | http://aws.amazon.com ) • SSL Termination ➡ stud ( https://github.com/bumptech/stud ) • GZIP Compression ➡ NGINX | haproxy • Serving Static Assets ➡ CDN | NGINX | Varnish ( https://www.varnish-cache.org/ )
  22. Know Your Bottlenecks • Node.JS serves really well as a highly concurrent networking app. • Node.JS is very sensitive to memory leaks and blocking code. • 99% of the time you will be IO-bound.
  23. Know the Ecosystem • Do Not Ignore The Ecosystem • Follow Community News and Updates • Attend Conferences (like this one) • Know Your Tools and Use Them
  24. Tweaking the OS
  25. Open File Limits • "Error: EMFILE, Too many open files" • ulimit -n 60000
  28. Configuring the Load Balancer * See https://bit.ly/nginx-rocks
  31. Additional Tweaks • sysctl -w net.core.somaxconn=1024 (default is 128)
  32. Even More Tweaks • Do NOT alter anything that you don’t understand! * See https://www.frozentux.net/ipsysctl-tutorial/ipsysctl-tutorial.html and http://www.tldp.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3/ for more info.
  33. Security
  34. Common Threats • XSS / CSRF • Input Validation Attack • DoS / ReDoS • Request Size * Securing Node.JS is no different from securing any other web app. See also: https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project
  35. Security • https://www.owasp.org/index.php/OWASP_Node_js_Goat_Project • http://nodesecurity.io/ • https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project
  36. Do Not Run Node.JS As Root
      useradd -mrU web
      mkdir /opt/web-app
      chown web /opt/web-app
      cd /opt/web-app
      su web
      node app.js
      firewall-cmd --permanent --zone=public --add-port=3000/tcp
      Also, always run Node.JS behind a reverse proxy!
  37. Let’s Get Our Hands Dirty
  38. restify • http • tcp (containers/000-simple-app-restify, containers/001-simple-app-http, containers/002-simple-app-tcp)
  39. restify • http • tcp • ab -n 10000 -c 100 http://app:8000/hello
  40. Going Bare Bones • ab -n 10000 -c 100 http://app:8000/hello • Tested on MacBook Pro, 2.4 GHz Intel Core i5, 16 GB 1600 MHz DDR3
  41. Is It Worth It? • You Can Go Bare-Bones for Maximum Throughput • Tradeoff: • Harder to maintain • More complex code • Error prone • Lots of edge cases • Harder to use additional tooling
  42. Concurrency
  43. Throughput vs Concurrency [chart; annotated regions: linear increase, slowdown, almost constant, rapid decline]
  53. Distributed Load Testing Toolbox • Apps • jMeter: http://jmeter.apache.org • Gatling: http://gatling.io/#/ • The Grinder: http://grinder.sourceforge.net • Locust: https://github.com/locustio/locust • “as a service” • flood.io: https://flood.io • loader.io: http://loader.io • LoadImpact: https://loadimpact.com • BlazeMeter: https://www.blazemeter.com • LoadStorm: http://loadstorm.com
  54. Latency
  56. Latency and Throughput
  59. Lessons Learned • Latency Kills • Know Your Platform & Know Your Tools • For maximum throughput go bare bones • Tradeoff: Giving up all the benefits a framework has to offer • Low-level code is harder to maintain: • Harder to Test and Verify / Easier to Create Bugs and Regressions • Corollary: As you add additional layers of abstraction, your API will marginally slow down. • The Inception Rule: More than three levels and you’re lost forever!
  60. Perf Before Scale
  61. Perf Advice Is Addictive • Optimization without measurement is futile.
  62. Perf Before Scale • Rule #1: Avoid premature optimization. Do measurements, and optimize what matters. • Tweak Your System for High Performance • Cache All The Things • Cache at every level. • The fastest API response is no response at all. • Delegate Long-Running/CPU-Intensive(*) Operations • Be Lazy Whenever Possible
  63. Optimization • You can’t optimize what you don’t measure.
  64. Things to Watch Out For • Always Keep an Eye on the Event Loop • Your API Service May Become CPU-Bound • External API Calls Can Be a Bottleneck • Track Heap Usage Over Time • Implement Sanity Checks • Implement Circuit Breakers • Have an Upper Bound for Concurrency
  65. Things to Watch Out For • Is the app running and functional? • Is the app overloaded? • How many errors have been raised so far? • Is the app performant (throughput, memory utilization, concurrency)? • Is my cluster healthy? • How many times have forks been restarted? • Are all clustered forks alive and okay?
  66. Which Will (most of the time) Boil Down to… • Watching Response Times • Watching CPU Utilization + General Sys Resource Usage • Watching Number of Concurrent Connections
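
"Keeping an eye on the event loop" can be approximated by measuring how late a repeating timer fires. The sketch below is not from the talk; `watchEventLoopLag` and `isOverloaded` are hypothetical names, and the threshold is something you would pick from your own measurements.

```javascript
// Hypothetical helper: reports how many milliseconds late each tick fires.
// A large value suggests that synchronous work is blocking the event loop.
function watchEventLoopLag(intervalMs, onSample) {
  let expected = Date.now() + intervalMs;
  const timer = setInterval(function sampleLag() {
    const now = Date.now();
    const lagMs = Math.max(0, now - expected);
    expected = now + intervalMs;
    onSample(lagMs);
  }, intervalMs);
  timer.unref(); // monitoring alone should not keep the process alive
  return function stop() {
    clearInterval(timer);
  };
}

// Hypothetical policy: treat the process as overloaded above a threshold.
function isOverloaded(lagMs, thresholdMs) {
  return lagMs > thresholdMs;
}
```

A request handler could consult the latest sample and answer with a 503 while `isOverloaded` is true, which also gives you a crude upper bound for concurrency.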
  67. v8 Optimizations
  68. Types of Compilers in v8 • Generic Compiler • Optimizing Compiler (Crankshaft) • Can Be Two or More Orders of Magnitude Faster See also: * https://wingolog.org/archives/2011/07/05/v8-a-tale-of-two-compilers * http://thibaultlaurens.github.io/javascript/2013/04/29/how-the-v8-engine-works/ * http://www.html5rocks.com/en/tutorials/speed/v8/
  69. X-Ray View Into the v8 Compiler • node --trace_opt --trace_deopt --allow-natives-syntax test.js • console.log(%HasFastProperties(obj)) • console.log(%GetOptimizationStatus(fn)) • https://github.com/Nathanaela/v8-natives
  71. Optimize Hot Code Paths Only (Unless You Have Solid Evidence to Do Otherwise)
  72. v8 Optimization Killers • Using debugger anywhere within the function. • Using eval anywhere within the function. • Using with anywhere within the function. • Using try/catch anywhere within the function. * via https://github.com/petkaantonov/bluebird/wiki/Optimization-killers
  73. Typical Example: try/catch Inside Function
  74. Isolate try/catch
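
The transcript does not carry the code from these slides, so here is a minimal sketch of the idea under assumed names (`tryParseJson` and `sumPrices` are illustrative, not the author's). The try/catch lives in a tiny helper so the hot function contains none.

```javascript
// Tiny helper whose only job is to contain the try/catch.
function tryParseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err };
  }
}

// The hot path has no try/catch of its own, so a compiler that refuses to
// optimize functions containing try/catch can still optimize this one.
function sumPrices(jsonText) {
  const parsed = tryParseJson(jsonText);
  if (!parsed.ok) return 0;
  let total = 0;
  for (const item of parsed.value) {
    total += item.price;
  }
  return total;
}
```

The advice targets the Crankshaft compiler named on the earlier slide, so measure on your own Node version before restructuring code around it.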

  78. Perform Lazy Evaluations [code slides; annotations: Sync, Async]
  87. Let’s Create Something Real • An API that… • Auto-suggests tags, given a URL • Lists related URLs, given a tag
  88. containers/003-the-real-deal
  92. Initial Topology [diagram labels: API Service, Internet, Bastion, “Simulated by an NGINX static web server”, test API] * Fetch HTML off of websites * Simplify and convert the HTML to plain text * Do NLP/Tokenization on the plain text * Create tags as a result
  93. Let’s Test How Our API Performs
  94. Findings • get-tags appears to be CPU-bound. • While get-tags is being requested, get-urls becomes two orders of magnitude slower. • get-urls appears to be pretty fast, and it is not CPU-bound.
  95. How Can We Be Sure? • Add probes (DTrace, XTrace, etc.) to trace what’s happening. • Create a REPL to check the app at runtime. • containers/004-demo-w-instrumentation
  96. Creating a REPL • You Can Expose Internal State via an API and/or a CLI/REPL • vantage: https://github.com/dthree/vantage • kang: https://github.com/davepacheco/kang • repl server: https://nodejs.org/api/repl.html • Expose Additional Logging Info at Runtime (in systems that support it) • bunyan -p ( https://github.com/trentm/node-bunyan )
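
A rough sketch of the "repl server" option using only Node's built-in `net` and `repl` modules. The function name, prompt, and the idea of passing an `exposed` object are assumptions for illustration. Binding to 127.0.0.1 is deliberate, since a REPL grants full control over the process.

```javascript
const net = require('net');
const repl = require('repl');

// Hypothetical helper: serves a REPL over TCP on localhost and makes the
// properties of `exposed` available as variables inside each session.
function startReplServer(port, exposed) {
  const server = net.createServer(function onConnection(socket) {
    const session = repl.start({
      prompt: 'app> ',
      input: socket,
      output: socket,
    });
    Object.assign(session.context, exposed);
    session.on('exit', function onReplExit() {
      socket.end();
    });
  });
  server.listen(port, '127.0.0.1');
  return server;
}

// Example (not executed here):
//   startReplServer(5001, { stats: { requests: 0 } });
//   then connect with: nc 127.0.0.1 5001
```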
  97. The REPL
  100. Adding Probes
  101. Adding Probes (App Node) * See https://github.com/v0lkan/kiraz
  104. Adding Probes (Bastion Host)
  107. Findings (get-tags)
  110. Monitoring Toolbox • Runtime Performance Probing (Kernel-Level Tools) • Linux Perf Events ( https://perf.wiki.kernel.org/index.php/Main_Page ) • perf record -F 71 -p `pgrep -n node` -g -- sleep 30 • node --perf_basic_prof_only_functions • Dtrace ( http://dtrace.org/blogs/about/ ) • Tracking Transactions and Tracing Latency • Zipkin ( https://github.com/openzipkin/zipkin ) • Runtime Memory Usage (heap stats, heap diffing, leak detection) • Memwatch ( https://github.com/lloyd/node-memwatch ) • See http://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection for details.
  113. Monitoring Toolbox • Monitoring “as a service” • nodetime https://nodetime.com/ • newrelic http://newrelic.com/nodejs • strongloop https://strongloop.com/node-js/performance-monitoring/ • keymetrics https://keymetrics.io/ • appdynamics https://www.appdynamics.com/nodejs/ • …
  114. So… Something Is CPU-Intensive • get-tags is CPU-bound and it also blocks the event loop • What can we do? • Split the computationally heavy parts, fork them as child processes, and use external libraries. • Create a native Node.JS extension that does not block the event loop. • Refactor the compute logic into a separate service first. [diagram labels: app, memory, worker ×3, child_process]
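
To illustrate the first option, here is a single-file sketch that pushes a CPU-heavy step into a child process and gets the result back over Node's IPC channel. The word-counting workload and all names are invented for the example. The worker source is inlined as a string only to keep the sketch self-contained; a real project would more likely put it in its own module and start it with `child_process.fork()`.

```javascript
const { spawn } = require('child_process');

// Stand-in for the heavy computation; runs in a separate Node process.
const WORKER_SOURCE = `
  process.on('message', function onJob(text) {
    const counts = {};
    for (const word of text.toLowerCase().split(' ')) {
      if (word) counts[word] = (counts[word] || 0) + 1;
    }
    process.send(counts);
  });
`;

// Hypothetical helper: resolves with the word counts computed by the child,
// so the parent's event loop stays free while the child does the work.
function countWordsInChild(text) {
  return new Promise(function run(resolve, reject) {
    const child = spawn(process.execPath, ['-e', WORKER_SOURCE], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('error', reject);
    child.once('exit', function onExit(code) {
      // No effect if the promise has already been resolved.
      reject(new Error('worker exited before replying, code ' + code));
    });
    child.once('message', function onResult(counts) {
      resolve(counts);
      child.disconnect(); // closing the channel lets the idle worker exit
    });
    child.send(text);
  });
}
```

Spawning a process per request is expensive, so a small pool of long-lived workers is usually the next step.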
  120. Split App and Compute Nodes [diagram labels: API Service, Message Bus, Compute Service] * Message bus: rabbitmq, zeromq, resque etc.; see also http://queues.io/ * containers/005-demo-split-compute
  121. Message Bus Topologies • Send/Listen (P; C) • Worker Queue (P; C1, C2) • PubSub (P; X; C1, C2)
  123. Split App and Compute Nodes (Message Bus) [diagram labels: API Service ×3 …, RabbitMQ Request Queue, Compute Service ×3 …, Response Queue ×3] • round-robin dispatch • one response queue per service
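
The difference between the worker-queue topology (round-robin dispatch, each message handled once) and pub/sub (every subscriber sees every message) can be shown with an in-memory stand-in. This is only an illustration of the semantics, with hypothetical names; it is not RabbitMQ client code.

```javascript
// Worker queue: each published message goes to exactly one consumer,
// chosen in round-robin order.
function createWorkerQueue() {
  const consumers = [];
  let next = 0;
  return {
    subscribe(handler) {
      consumers.push(handler);
    },
    publish(message) {
      if (consumers.length === 0) throw new Error('no consumers registered');
      const handler = consumers[next];
      next = (next + 1) % consumers.length;
      return handler(message);
    },
  };
}

// Pub/sub: every subscriber receives every published message.
function createPubSub() {
  const subscribers = [];
  return {
    subscribe(handler) {
      subscribers.push(handler);
    },
    publish(message) {
      subscribers.forEach(function deliver(handler) {
        handler(message);
      });
    },
  };
}
```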
  129. Log Aggregation • “Good developers debug. Great developers read logs.”
  130. Aggregate and Rotate Your Log Files [diagram labels: API Service (memory), Compute Service (memory), Message Bus, Log Aggregator] • containers/006-demo-eventbus-logaggr
  133. Use a Decent Logger • Bunyan ( https://github.com/trentm/node-bunyan ) • Winston ( https://github.com/winstonjs/winston ) • Log4JS ( https://github.com/nomiddlename/log4js-node )
  134. What to Log • Authentication & Authorization • Session Management • Method Entry Points • Errors and Weird Events • Specific Events (startup, shutdown, slowdown etc.) • High-Risk Functionalities (payments, privileges, admins etc.)
  135. Log Analysis Toolbox • Loggly ( https://www.loggly.com/ ) • ELK Stack ( https://www.elastic.co/products ) • Nagios Log Server ( https://www.nagios.com/products/nagios-log-server/ ) • Splunk ( http://www.splunk.com/en_us/homepage.html ) • …
  136. Utilize Caching [diagram labels: API Service, in-memory cache, Message Bus, Compute Service] • containers/006-demo-eventbus-logaggr
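
As a sketch of what an in-memory cache in front of an expensive call might look like, assuming a simple time-to-live policy (the talk's demo code is not in the transcript, and `createTtlCache` and `cached` are made-up names):

```javascript
// Minimal TTL cache. `now` is injectable so expiry can be tested without waiting.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

// Wraps an expensive function so repeated calls with the same key reuse
// the cached result until it expires.
function cached(cache, compute) {
  return function lookup(key) {
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const value = compute(key);
    cache.set(key, value);
    return value;
  };
}
```

This sketch never evicts entries that are not read again, so a long-running service would also need a size bound to avoid turning the cache into a memory leak.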
  141. What If My App Crashes?
  143. Processes Die. Accept it. • No system is 100% resilient. • Every crash is important. • Every Exception is Important Too: Adopt a “Zero Exception Policy”
  144. Processes Die. Accept it. • containers/007-demo-nodejs-as-a-service
  145. Keep It Running • forever ( https://github.com/foreverjs/forever ) • pm2 ( https://github.com/Unitech/pm2 ) • upstart ( http://upstart.ubuntu.com/ ) • systemd ( https://www.wikiwand.com/en/Systemd )
  149. Processes Die. Accept it. • https://www.joyent.com/blog/mdb-and-node-js
  150. Debugging
  151. Live Debugging: Given the Tornado, Where’s the Butterfly?
  152. Post-Mortem Debugging
  153. Node.JS Debugging Myths • Debugging and Profiling in Node.JS is Hard • Debugging and Profiling in Node.JS is Immature • You Cannot Debug or Profile a Live Production Node.JS App
  154. Debugging • Live Debugging (using a REPL) • Remote Debugging (Node Inspector https://github.com/node-inspector/node-inspector , WebStorm https://www.jetbrains.com/webstorm/ , Cloud9 IDE https://c9.io/ ) • Post-Mortem Debugging (MDB: https://github.com/joyent/mdb_v8 )
  159. Flame Graphs • http://www.brendangregg.com/flamegraphs.html • http://github.com/brendangregg/FlameGraph
  160. Flame Graphs & Core Dumps • Core Dumps • Can Be Created When Node.JS Crashes ( --abort_on_uncaught_exception ) • Can Be Created at Runtime ( using gcore * ) • Flame Graphs • You Can Use dtrace + stackvis to generate them ** • You Can Use perf events + Flame Graphs Tool to generate them *** • * http://man7.org/linux/man-pages/man1/gcore.1.html • ** http://blog.nodejs.org/2012/04/25/profiling-node-js/ • *** http://yunong.io/2015/11/23/generating-node-js-flame-graphs/
  161. Debugging (Profiling) • Use Kernel Level Tools • DTrace (Solaris, BSD), perf (Linux), and XPerf (Windows) • Can be used in production • Use the v8 Profiler • Not quite suitable for production
  163. v8 Profiler • node --v8-options | grep gc • node --v8-options | grep '\-\-trace' • `node --perf_basic_prof_only_functions .` => for perf events (new in Node 5) • `node --expose_gc --trace_gc --trace_gc_object_stats --trace_gc_verbose --gc_global .` => traces to the console • `node --prof --log_timer_events --track_gc_object_stats --log_internal-timer_events --no-use-inlining .` => creates a perf log file * See also: http://www.chromium.org/developers/creating-v8-profiling-timeline-plots
  168. Debugging Demo • containers/008-demo-watching-for-leaks
  169. Help the Debugger • Always Name Your Functions • Don’t let errors go unhandled. • Emit “error” events instead of throwing exceptions. • Use an error library: https://github.com/davepacheco/node-verror • Put a descriptive message before raising an error.
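
A small sketch of three of these points together: named functions, an emitted "error" event instead of a throw, and a descriptive message. The `createTagFetcher` factory and the `EBADURL` code are hypothetical, chosen only to make the example concrete.

```javascript
const { EventEmitter } = require('events');

// Hypothetical module: validates its input and reports problems through an
// "error" event that carries a descriptive, searchable message.
function createTagFetcher() {
  const events = new EventEmitter();

  function fetchTags(url) {
    if (typeof url !== 'string' || url.length === 0) {
      const err = new Error(
        'fetchTags: expected a non-empty url string, got ' + JSON.stringify(url)
      );
      err.code = 'EBADURL';
      events.emit('error', err);
      return false;
    }
    events.emit('fetched', url);
    return true;
  }

  return { on: events.on.bind(events), fetchTags };
}

// Named listeners show up by name in stack traces, profiles, and flame graphs:
//   fetcher.on('error', function onFetchError(err) { /* log and recover */ });
```

An EventEmitter throws if "error" is emitted with no listener attached, so the process only stays up when a handler is actually registered.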
  171. Use a Private NPM [diagram labels: API Service (memory), Compute Service (memory), Message Bus, Log Aggregator, Private NPM (cache / mirror), Public NPM] • containers/009-demo-setting-up-private-npm • sinopia: https://github.com/rlidwka/sinopia
  183. Use a Private NPM • ../../../../../../wtf ?!
  188. Use a Private NPM (prefix local modules)
  190. Use a Private NPM • Promotes modularization and code re-use. • Modules are cached, hence faster to install. • You can continue your work, even when the public registry goes offline. • Makes refactoring and testing easier. • No more “../../../..”s!
  191. Clustering • :(){ :|:& };:
  192. Clustering • containers/010-cluster * See http://docs.libuv.org/en/v1.x/threadpool.html and https://nikhilm.github.io/uvbook/processes.html and https://nikhilm.github.io/uvbook/threads.html for how the dark magic works internally. * See also https://strongloop.com/strongblog/whats-new-in-node-js-v0-12-cluster-round-robin-load-balancing/ for how the load balancing between processes in the cluster module evolved over time; and see https://github.com/nodejs/node-v0.x-archive/commit/e72cd41 for the Round-Robin cluster load balancing algorithm.
  193. Clustering [diagram labels: app ×4, memory]
  194. Is Bigger Always Better? • ultra mega super box with bazillion cores vs. regular box
  195. How Many Workers Per VM? • Two to four cores per VM is an ideal balance. [diagram labels: master, child_process ×4]
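
A minimal sketch of that master/worker layout with Node's built-in `cluster` module. The cap of four workers simply mirrors the "two to four cores" guidance above; `pickWorkerCount` and `startCluster` are illustrative names, not code from the talk.

```javascript
const cluster = require('cluster');
const os = require('os');

// Caps the number of workers; the default of 4 follows the slide's guidance.
function pickWorkerCount(cpuCount, max = 4) {
  return Math.max(1, Math.min(cpuCount, max));
}

// In the primary process: fork workers and replace any that crash.
// In a worker process: run the supplied startWorker function.
function startCluster(startWorker) {
  // cluster.isPrimary is the newer name; older Node versions use isMaster.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (!isPrimary) {
    startWorker();
    return;
  }
  const count = pickWorkerCount(os.cpus().length);
  for (let i = 0; i < count; i += 1) {
    cluster.fork();
  }
  cluster.on('exit', function onWorkerExit(worker) {
    // Do not respawn workers that were shut down on purpose.
    if (!worker.exitedAfterDisconnect) cluster.fork();
  });
}

// Example (not executed here):
//   startCluster(function startWorker() { /* create and listen on the HTTP server */ });
```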
  196. Is Bigger Always Better? • OR you can use lightweight single-CPU containers and a LB in lieu of clustering [diagram labels: Load Balancer, lightweight container ×4 (single core each)]
  197. Cluster The Services [diagram labels: VM 1: API Service ×2 (cluster); VM 2: Compute Service ×2 (cluster); Message Bus]
  198. Cluster
  201. Cluster + Zero Downtime Rolling Deployments
  202. Zero Downtime Rolling Deployments • kill -USR2 <pid>
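
One way the primary process might react to that signal is to replace workers one at a time, so that some worker is always accepting connections. This is a sketch under two assumptions: the workers are started through the `cluster` module, and each worker calls `listen()` so that the "listening" event fires. `rollingRestart` is a hypothetical name.

```javascript
const cluster = require('cluster');

// Replaces each current worker in turn: start a fresh worker, wait until it
// is listening, then let the old one finish its in-flight work and exit.
function rollingRestart(done) {
  const oldWorkers = Object.values(cluster.workers);

  function replaceNext() {
    const old = oldWorkers.shift();
    if (!old) {
      if (done) done();
      return;
    }
    const fresh = cluster.fork();
    fresh.once('listening', function onFreshReady() {
      old.once('exit', replaceNext);
      old.disconnect(); // stop accepting new connections, then exit
    });
  }

  replaceNext();
}

// Wiring it to the signal (not executed here):
//   process.on('SIGUSR2', function onReloadSignal() { rollingRestart(); });
```

Freshly forked workers load the code that is on disk at fork time, which is what makes this usable for deployments.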

  209. Circuit Breaker closed fail (under threshold) open fail (reached threshold)

    checking… timer (exponential backoff) fail success See http://www.amazon.com/gp/product/0978739213 and http://martinfowler.com/bliki/CircuitBreaker.html (503: Server Busy) (200: OK)
  210. Circuit Breaker * This is a simplified example, and it

    does not strictly follow circuit-breaker state transitions. See:https://github.com/yammer/circuit-breaker-js and https://github.com/mweagle/circuit-breaker for more canonical implementations. local-modules/local-fluent-circut * You can use https://github.com/lloyd/node-toobusy for checking event loop delay.
  216. Circuit Breaker • Can be used with any kind of

    metric. • You can use it to “rate limit” your API. • Useful when you depend on other APIs that might fail.
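A minimal sketch of the idea, assuming a breaker that trips after a few consecutive failures and waits with exponential backoff before letting one trial request through. The class name, the option names (`threshold`, `baseDelayMs`, `maxDelayMs`, `now`), and the `guard` helper are illustrative and are not taken from the talk's repository or from the libraries linked above.

```javascript
'use strict';

const CLOSED = 'closed';       // requests flow through
const OPEN = 'open';           // requests are rejected immediately
const HALF_OPEN = 'half-open'; // a single trial request is allowed

// Hypothetical breaker; every option name here is an assumption.
class CircuitBreaker {
  constructor({ threshold = 3, baseDelayMs = 1000, maxDelayMs = 60000, now = Date.now } = {}) {
    this.threshold = threshold;
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
    this.now = now;
    this.state = CLOSED;
    this.failures = 0; // consecutive failures while closed
    this.trips = 0;    // how many times in a row the breaker has opened
    this.retryAt = 0;  // time after which one trial request is allowed
  }

  // Returns true if the caller may attempt the protected operation.
  allow() {
    if (this.state === OPEN && this.now() >= this.retryAt) {
      this.state = HALF_OPEN;
      return true;
    }
    return this.state === CLOSED;
  }

  // Report a successful call: close the breaker and reset the backoff.
  success() {
    this.state = CLOSED;
    this.failures = 0;
    this.trips = 0;
  }

  // Report a failed call: trip on a failed trial or when the threshold is hit.
  failure() {
    if (this.state === OPEN) return;
    this.failures += 1;
    if (this.state === HALF_OPEN || this.failures >= this.threshold) this.trip();
  }

  trip() {
    // Exponential backoff: the wait doubles each time the breaker re-opens.
    const delay = Math.min(this.baseDelayMs * 2 ** this.trips, this.maxDelayMs);
    this.trips += 1;
    this.failures = 0;
    this.state = OPEN;
    this.retryAt = this.now() + delay;
  }
}

// Wraps a request handler: answer 503 while the breaker is open.
function guard(breaker, handler) {
  return (req, res) => {
    if (!breaker.allow()) {
      res.statusCode = 503;
      return res.end('Server Busy');
    }
    return handler(req, res);
  };
}
```

The `failure()` and `success()` calls can be driven by any metric (error counts, latency, event loop delay), which is what makes the same structure usable for rate limiting as well.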
  217. Where Were We? VM 2 Compute Service Compute Service cluster

    API Service API Service cluster VM 1 Message Bus
  218. VM 2 Compute Service Compute Service cluster API Service API

    Service cluster VM 1 Message Bus memory memory Are We Missing Something?
  221. Move the State Information Out VM 2 Compute Service Compute

    Service cluster redis API Service API Service redis cluster VM 1 Message Bus containers/011-sharing-memory • Use redis to solve session affinity. • Use token-based authentication with JWT to handle authentication 
 ( https://scotch.io/tutorials/the-ins-and-outs-of-token-based-authentication ).
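To illustrate why token-based authentication removes the need for session affinity, here is a rough sketch of signing and verifying an HS256 JWT using only Node's built-in `crypto` module. Any API node that knows the shared secret can verify the token without looking up in-memory session state. The function names are made up for this example, and a production service would more likely rely on a vetted JWT library.

```javascript
'use strict';
const crypto = require('crypto');

// base64url encoding as used by JWT (URL-safe alphabet, no padding).
function b64url(input) {
  return Buffer.from(input).toString('base64')
    .replace(/=+$/, '').replace(/\+/g, '-').replace(/\//g, '_');
}

// HMAC-SHA256 signature over `data`, base64url-encoded.
function hmac(data, secret) {
  return b64url(crypto.createHmac('sha256', secret).update(data).digest());
}

// Builds a header.payload.signature token.
function sign(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  return `${header}.${body}.${hmac(`${header}.${body}`, secret)}`;
}

// Returns the payload if the signature matches and the token has not
// expired; returns null otherwise.
function verify(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;
  const expected = Buffer.from(hmac(`${header}.${body}`, secret));
  const actual = Buffer.from(signature);
  if (expected.length !== actual.length) return null;
  if (!crypto.timingSafeEqual(expected, actual)) return null;
  const b64 = body.replace(/-/g, '+').replace(/_/g, '/');
  const payload = JSON.parse(Buffer.from(b64, 'base64').toString('utf8'));
  if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null;
  return payload;
}
```

State that cannot live inside a token (carts, rate counters, and so on) still needs a shared store such as redis, as the slide suggests.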
  224. Add a Load-Balancer Compute Service Compute Service cluster redis API

    Service API Service redis cluster Compute Service Compute Service cluster API Service API Service cluster Load Balancer Message Bus containers/012-bounce
  226. Add AutoScale Rules autoscale groups Compute Service Compute Service cluster

    redis API Service API Service redis cluster Compute Service Compute Service cluster API Service API Service cluster Load Balancer Message Bus
  227. Load Balancing Options • Load Balancing as a Service (AWS,

    Rackspace…) • Hardware Load Balancer (Cisco CEF, Barracuda, etc…) • Software Load Balancer • NGINX • HAProxy • home grown
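For the "home grown" option, the core of a software load balancer is a selection policy. The sketch below shows plain round-robin selection that skips backends marked unhealthy. The class and method names are hypothetical, and it leaves out everything a real balancer needs beyond selection (proxying, timeouts, health probing).

```javascript
'use strict';

// Rotates through backends, skipping any that are marked unhealthy.
class RoundRobin {
  constructor(backends) {
    this.backends = backends.map((address) => ({ address, healthy: true }));
    this.cursor = 0;
  }

  // Marks a backend up or down, e.g. from a health check result.
  setHealthy(address, healthy) {
    const backend = this.backends.find((b) => b.address === address);
    if (backend) backend.healthy = healthy;
  }

  // Returns the next healthy backend address, or null if none is available.
  next() {
    for (let i = 0; i < this.backends.length; i += 1) {
      const backend = this.backends[this.cursor];
      this.cursor = (this.cursor + 1) % this.backends.length;
      if (backend.healthy) return backend.address;
    }
    return null;
  }
}

// Usage (hostnames are placeholders):
// const pool = new RoundRobin(['api-1.internal:3000', 'api-2.internal:3000']);
// const target = pool.next();
```

NGINX and HAProxy implement this policy (and several others) out of the box, which is usually a better starting point than writing your own.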
  228. Load Balancer

  233. Wait! Aren’t These Actually Microservices? API app compute app worker

    worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process API app compute app worker worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process broker load balancer Internet message bus redis redis … …
  235. API app compute app worker worker worker child_process API app

    c l u s t e r c l u s t e r compute app worker worker worker child_process API app compute app worker worker worker child_process API app c l u s t e r c l u s t e r compute app worker worker worker child_process broker load balancer Internet message bus redis redis … … API μ-Service Compute μ-Service Wait! Aren’t These Actually Microservices? * See Also: http://martinfowler.com/articles/microservice-trade-offs.html http://highscalability.com/blog/2014/4/8/microservices-not-a-free-lunch.html https://rclayton.silvrback.com/failing-at-microservices
  236. That Means You’ve Become Famous Scalability Will Be the Least

    of Your Concerns What If I Reach The Scalability Limits Within a Region?
  237. Multiple Regions Region 1 Compute Service Compute Service redis API

    Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Round-Robin DNS The Internet Message Bus Message Bus containers/013-round-robin
  242. Round-Robin DNS containers/013-round-robin

  244. Multiple Regions Load Balancer Load Balancer Region 1 Compute Service

    Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB replication replication Round-Robin DNS The Internet Message Bus Message Bus
  247. You Can Add More Region 1 Compute AutoScale Group API

    AutoScale Group LB LB DNS The Internet Message Bus Region 2 Compute AutoScale Group API AutoScale Group Message Bus … Region N Compute AutoScale Group API AutoScale Group Message Bus LB
  248. How Do I Manage All This Infrastructure? This is Getting

    Out of Hand! Region 1 Compute AutoScale Group API AutoScale Group LB LB DNS The Internet Message Bus Region 2 Compute AutoScale Group API AutoScale Group Message Bus … Region N Compute AutoScale Group API AutoScale Group Message Bus LB
  249. Configuration Management

  250. • No Hard-Coded IP Addresses in Config Files • Let

    DNS do What it Does Best • Use Environment Variables for Infrastructure Management • Converge Your Infrastructure Using a Central Service • Salt Cloud ( https://docs.saltstack.com/en/develop/topics/cloud/index.html ) • AWS CloudFormation ( https://aws.amazon.com/cloudformation/ ) • Chef ( https://www.chef.io/ ) • Puppet ( https://puppetlabs.com/ ) • Ansible ( http://www.ansible.com/ ) • Service Discovery • Consul ( https://www.consul.io/ ) Configuration Management Tips
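As a small illustration of the first three tips, the sketch below reads infrastructure settings from environment variables and expects DNS names rather than IP addresses. The variable names (`PORT`, `REDIS_HOST`, `BUS_HOST`) and the example hostnames are assumptions made for this example.

```javascript
'use strict';

// Reads infrastructure settings from the environment.
// All variable names here are hypothetical.
function loadConfig(env = process.env) {
  const required = (name) => {
    const value = env[name];
    if (!value) throw new Error(`Missing environment variable: ${name}`);
    return value;
  };

  const port = Number.parseInt(env.PORT || '3000', 10);
  if (Number.isNaN(port)) throw new Error('PORT must be a number');

  return Object.freeze({
    port,
    redisHost: required('REDIS_HOST'), // a DNS name, not a hard-coded IP
    busHost: required('BUS_HOST'),     // same for the message bus
  });
}

// Usage:
// REDIS_HOST=redis.internal BUS_HOST=bus.internal node app.js
// const config = loadConfig();
```

Failing fast on a missing variable keeps a misconfigured node from joining the cluster in a half-working state.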
  257. Test Your System as a Whole

  259. CI / CD • Use a CI / CD Pipeline

    • Show Love to Test-Driven Development • Don’t Forget Functional Tests and Integration Tests
  260. Continuously Keep Your Code In Ship Shape • ESLint (

    http://eslint.org ) • CodeClimate ( https://codeclimate.com/features ) • GreenKeeper ( http://greenkeeper.io ) • npm scripts (instead of Grunt or Gulp — YMMV)
 ( https://docs.npmjs.com/misc/scripts ) • npm outdated ( https://docs.npmjs.com/cli/outdated ) • git pre-commit hooks ( https://github.com/observing/pre-commit ) • [ hint: Install your development dependencies (such as eslint, babel, gulp, etc.) locally (not globally)! ]
  261. Are We Done Yet? Load Balancer Load Balancer Region 1

    Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB Region 2 Compute Service Compute Service redis API Service API Service redis Compute Service Compute Service API Service API Service LB replication replication Round-Robin DNS The Internet Message Bus Message Bus
  270. Making the Load Balancer HA * see also: https://www.wikiwand.com/en/Virtual_Router_Redundancy_Protocol Load

    Balancer client Load Balancer Load Balancer client keepalived active failover
  272. Making the Load Balancer Highly Available • round-robin DNS •

    https://www.wikiwand.com/en/Round-robin_DNS • heartbeat • https://www.wikiwand.com/en/Heartbeat_(computing) • keepalived • http://keepalived.org/ * You can use these tools to make any component HA.
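The heartbeat idea behind active/failover pairs can be sketched in a few lines: the standby listens for heartbeats from the active node and takes over once they stop arriving. This is only a toy model of the decision logic, with hypothetical names and an assumed timeout. It is not how keepalived or VRRP are implemented, and it ignores virtual IP handover and split-brain protection.

```javascript
'use strict';

// Runs on the standby node: tracks heartbeats from the active node and
// decides when the standby should take over.
class FailoverMonitor {
  constructor({ timeoutMs = 3000, now = Date.now } = {}) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastBeat = now();
    this.role = 'standby';
  }

  // Call whenever a heartbeat arrives from the active node.
  heartbeat() {
    this.lastBeat = this.now();
    this.role = 'standby'; // the peer is alive again, so step back
  }

  // Call periodically; promotes this node if the peer has gone silent.
  check() {
    if (this.now() - this.lastBeat > this.timeoutMs) this.role = 'active';
    return this.role;
  }
}
```

In practice you would let keepalived or a similar tool make this decision and move the virtual IP, rather than running it inside your application.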
  273. SSL Termination * * * https://github.com/bumptech/stud Load Balancer Load Balancer

    client keepalived active failover SSL Terminator SSL Terminator client keepalived active failover Load Balancer Load Balancer
  276. Make Redis and RabbitMQ Redundant redis redis (master) redis (read

    replica) redis (read replica) redis (read replica) redis (master) redis (read replica) redis (read replica) redis (read replica) round-robin DNS This will also increase throughput as a side benefit. See http://redis.io/topics/replication and http://redis.io/topics/cluster-tutorial. You can also use a managed “memory as a service” solution. See also https://www.rabbitmq.com/ha.html for how similar queue mirroring is implemented for a RabbitMQ cluster. 
 And similarly, you can use a managed “queue as a service” solution to ease your pain ;)
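The throughput benefit comes from sending reads to the replicas while writes still go to the master. A rough sketch of that routing decision follows. The class name, the short list of read commands, and the hostnames are illustrative, and the sketch does not talk to redis itself.

```javascript
'use strict';

// A few read-only redis commands; a real list would be longer.
const READ_COMMANDS = new Set(['GET', 'MGET', 'EXISTS', 'TTL', 'HGET', 'HGETALL', 'SMEMBERS']);

// Chooses which redis host should receive a command.
class ReplicaRouter {
  constructor(master, replicas = []) {
    this.master = master;
    this.replicas = replicas;
    this.cursor = 0;
  }

  // Writes (and anything unknown) go to the master;
  // reads rotate across the read replicas.
  route(command) {
    const isRead = READ_COMMANDS.has(String(command).toUpperCase());
    if (!isRead || this.replicas.length === 0) return this.master;
    const target = this.replicas[this.cursor];
    this.cursor = (this.cursor + 1) % this.replicas.length;
    return target;
  }
}

// Usage (hostnames are placeholders):
// const router = new ReplicaRouter('redis-master.internal',
//   ['redis-replica-1.internal', 'redis-replica-2.internal']);
// router.route('GET'); // one of the replicas
// router.route('SET'); // the master
```

Because replication is asynchronous, a read from a replica may briefly return stale data, so this only fits data that tolerates eventual consistency.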
  277. Build Redundancy Everywhere

  278. Build Redundancy Everywhere Note This is more typically done by

    using the sidekick health checks of your service discovery tool. See https://www.consul.io/intro/getting-started/checks.html for an example.
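A service discovery tool needs something to probe. One common arrangement, sketched here under assumptions, is an HTTP health endpoint that returns 200 when the node's dependencies look fine and 503 otherwise. The `createHealthHandler` name and the shape of the `checks` object are made up for this example.

```javascript
'use strict';

// Builds a request handler that reports service health.
// `checks` maps a name to a function that returns true when healthy.
function createHealthHandler(checks) {
  return (req, res) => {
    const results = {};
    let healthy = true;
    for (const [name, check] of Object.entries(checks)) {
      let ok = false;
      try {
        ok = Boolean(check());
      } catch (err) {
        ok = false; // a throwing check counts as unhealthy
      }
      results[name] = ok;
      healthy = healthy && ok;
    }
    res.statusCode = healthy ? 200 : 503;
    res.setHeader('Content-Type', 'application/json');
    res.end(JSON.stringify({ healthy, checks: results }));
  };
}

// Usage (the check bodies and port are placeholders):
// const http = require('http');
// http.createServer(createHealthHandler({
//   redis: () => redisIsConnected,
//   bus: () => busIsConnected,
// })).listen(8080);
```

An HTTP check in a tool such as Consul can then poll this endpoint and take the node out of rotation when it stops answering with a 2xx status.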
  285. Torture Your System • Try Chaos Monkey • https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey •

    Randomly send `kill -9` to Processes • Randomly Knock a Server Offline • Intentionally Run Out of Disk Space • Take an entire data center down
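The "randomly send `kill -9`" step is easy to script. The sketch below picks one PID at random and sends it SIGKILL, the signal behind `kill -9`. The function name and its injectable `kill` and `random` options are assumptions made so the function can be dry-run without killing anything. It is not part of Chaos Monkey.

```javascript
'use strict';

// Picks one PID at random and sends it SIGKILL.
// `kill` and `random` can be replaced for dry runs and tests.
function killRandomProcess(pids, options = {}) {
  const kill = options.kill || ((pid, signal) => process.kill(pid, signal));
  const random = options.random || Math.random;
  if (pids.length === 0) return null;
  const victim = pids[Math.floor(random() * pids.length)];
  kill(victim, 'SIGKILL');
  return victim;
}

// Dry run: log instead of killing (the PIDs are placeholders).
// killRandomProcess([4011, 4012, 4013], {
//   kill: (pid, signal) => console.log(`would send ${signal} to ${pid}`),
// });
```

Run it only against worker processes you own, and start in a staging environment. The point is to confirm that the cluster respawns the worker and that clients never notice.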
  286. Summary Σ

  287. Summary Stateless is Better than Stateful Eventual Consistency Build Redundancy

    Everywhere! Start Up Fast, Shut Down Gracefully Solve Problems That Actually Exist
  288. Summary Never Assume, Always Measure Perf Before Scale Infrastructure is

    Code; Automate It! Keep Configuration Details in Environment Variables Show Love to DNS
  289. Summary • Know Your Ecosystem • Know Your Tools •

    Use Tools, not Rules!
  290. Scale 2 ∞ & 㱺 Region 1 Compute AutoScale Group

    API AutoScale Group LB LB DNS The Internet Message Bus Region 2 Compute AutoScale Group API AutoScale Group Message Bus … Region N Compute AutoScale Group API AutoScale Group Message Bus LB API Service Internet Bastion Simulated by an NGINX static web server * Fetch HTML off of websites * Simplify and convert the HTML to plain text * Do NLP/Tokenization on the plain text * Create tags as a result test API
  291. Thank You Questions?