FaaS Measurement Fundamentals

"Don't Worry About Servers, Still Worry About Metrics: FaaS Measurement Fundamentals"

Talk at Gluecon 2017.

Clay Smith

May 24, 2017


  1. Don't Worry About Servers, Still Worry About Metrics: FaaS Measurement Fundamentals
    @smithclay, New Relic, Gluecon 5/24/17
  2. Metrics are what we measure* (*hopefully useful things)

  3. λ: This Thing Appeared
    The Magic Hat of Werner Vogels
    How do we understand it?
  4. Hyped tech wish list
    Metrics: Trends
    Alerting: CTA
    Logging: Detail
    Tracing: Cause
    Analytics: All of the above
  5. MALTA Observability Index for FaaS
    Metrics, Alerting, Logging, Tracing, Analytics
    Maturity Level
  6. Built-in FaaS Metrics*
    Error Count
    Function Invocation Count
    Function Duration (ms)
    * not comprehensive, but the important ones
  7. Why does function invocation time vary so much?

  8. Event Trigger
    1. Invoke λ
    2. Run
    3. End: Result or Error or Timeout
  9. Cold Start vs Warm Start (Function Invocation Time)
    Warm: Event Trigger → Handler Code
    Cold: Event Trigger → Create → Initialize → Handler Code
  10. What's inside AWS Lambda? "It's containers" — Person waving their

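  11. λ: Running Commands for Discovery
        const exec = require('child_process').exec;
        // Run `whoami` inside the function to see which user it executes as.
        exports.handler = (trigger, context, cb) => {
          exec('whoami', (err, stdout) => {
            if (err) return cb(err);
            console.log(stdout);
            return cb(null);
          });
        };
    Log output: [LOG TIME] sbxuser_1066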
  12. λ is a UNIX system?! I know this!

  13. Let's run SSH in λ
    λ (ssh process) → SSH Tunnel
    Firewall: no inbound ports
  14. SSH in Lambda Architecture
    λ: node.js wrapper; go sshd binary (x64); Go SSH Crypto Libs; process.exec()
    https://github.com/smithclay/faassh
  15. Max Session Length: 5 minutes (custom prompt optional)
    https://github.com/smithclay/faassh

  16. Info from /proc
    cat /proc/cpuinfo: 2x Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
    cat /proc/meminfo: 3857664 kB
    cat /proc/modules: ixgbevf (EC2 10Gbps Network Driver)
    c4.large EC2 Compute-Optimized Instance (?)
  17. c4.large instance: λs in theory on a VM (10 Gbps)
    [Figure: grid of λ symbols. Legend: λ = Running 128 MB Function; λ = Not Running 128 MB Function]
  18. Frozen functions help avoid cold starts
    cgroup freezer subsystem
    https://www.kernel.org/doc/Documentation/cgroup-v1/freezer-subsystem.txt
  19. Internals Recap
    It's just containers on a VM
    Functions frozen when not running
    No magic unikernels involved
  20. How do we measure and prevent cold starts?
    "Use Kubernetes" - Troll
  21. Cold Start Discovery
        // Module scope runs once per container, so this is true only on a cold start.
        var SO_SO_COLD = true;
        exports.handler = function(trigger, context, cb) {
          console.log('Cold? %s', SO_SO_COLD);
          SO_SO_COLD = false;
          return cb(null);
        };
    https://github.com/smithclay/lambda-proc-info
  22. Warming automation
    Scheduled Event → λ, 4 minute interval
    Only effective for non-concurrent execution!
    http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html
  23. Sending cold start events to analytics
    λ console.log(coldStart) → Logs → CloudWatch Log Filter Trigger → λ → POST to Event DB (Insights)
  24. Cold Starts Visualized
    [Chart annotations: ~7 hrs, ~8 hrs]

  25. λ Host Uptime
    Cold starts happen when hosts change! (~8hrs)

  26. λ Host Subnet Hopping
    10.13, 10.12, 10.11, 10.13, 10.12, 10.12, 10.12, 10.11
    # of AZs in us-west-2: 3
  27. What's the maximum concurrency of your function?
    one: A scheduled event will warm it until host retires.
    > 1: More advanced strategy needed*
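When concurrency is above one, warming only helps if the warming invocations overlap, so that more than one container has to be running at the same time. This is a minimal sketch of that idea, not the author's script: `warmConcurrently` is a hypothetical helper, and `invoke` is a caller-supplied function (for example a wrapper around an AWS SDK or CLI call) that starts one invocation and returns a promise.

```javascript
// Start numExecutions invocations without waiting for earlier ones to
// finish, then resolve once all of them have completed.
function warmConcurrently(invoke, numExecutions) {
  const pending = [];
  for (let i = 1; i <= numExecutions; i++) {
    pending.push(invoke(i)); // started immediately, so the calls overlap
  }
  return Promise.all(pending);
}
```

Whether overlapping calls actually land on separate containers depends on the platform's scheduler, so it is worth confirming with the cold start logging described elsewhere in the deck.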
  28. "Advanced" Strategy
        for i in `seq 1 $NUM_EXECUTIONS`; do
          echo "[$i] Executing $AWS_LAMBDA_FUNCTION_NAME..."
          aws lambda invoke ...
        done
    https://gist.github.com/smithclay/e89dfe35fe2a4938db56bb12df76777c
  29. Multiple containers running on a single host to serve parallel requests
    Tracking /proc/sys/kernel/random/boot_id and hostname
    Confirmed: cold start happens on container init.
  30. So is this just a container PaaS?
    Only if your PaaS has...
    High-availability/multiple zones
    Elastic fleet of compute-optimized VMs
    A very good scheduling algorithm
    Design (freezing, limits, etc) for very fast invocation
  31. FaaS in Production Reality
    Dev λ vs Prod λ
    The "learning cliff": Orchestration (!?), Version/Deploy, Monitoring, Security, Cold Start Mgmt
    Great tweet from @mfdii
  32. FaaS Isn't a Silver Bullet
    "I've got a fast, computationally-intensive task that I need to perform occasionally in response to a well-defined event that isn't that sensitive to latency." - The Ideal FaaS Developer
    // TO DO: measure && share results
  33. Thanks. @smithclay New Relic Gluecon 5/24/17