FaaS Measurement Fundamentals

Don't Worry About Servers, Still Worry About Metrics
FaaS Measurement Fundamentals
@smithclay, New Relic
Gluecon 5/24/17

Metrics are what we measure*
* hopefully useful things

λ: This Thing Appeared
The Magic Hat of Werner Vogels
How do we understand it?

Hyped tech wish list
Metrics: Trends
Alerting: CTA
Logging: Detail
Tracing: Cause
Analytics: All of the above

MALTA: Observability Index for FaaS
Metrics, Alerting, Logging, Tracing, Analytics
Maturity Level

Built-in FaaS Metrics*
Error Count
Function Invocation Count
Function Duration (ms)
* not comprehensive, but the important ones

Why does function invocation time vary so much?
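One way to see how much invocation time varies is to summarize a batch of Function Duration values with percentiles instead of an average. This is a minimal sketch: the `percentile` and `summarize` helpers and the sample durations are made up for illustration and are not from the talk.

```javascript
// Nearest-rank percentile over a list of durations in milliseconds.
function percentile(durationsMs, p) {
  if (durationsMs.length === 0) return NaN;
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize a batch of durations; a large gap between p50 and p99
// usually means a few invocations are much slower than the rest.
function summarize(durationsMs) {
  return {
    count: durationsMs.length,
    p50: percentile(durationsMs, 50),
    p99: percentile(durationsMs, 99),
    max: Math.max(...durationsMs),
  };
}

// Hypothetical sample: mostly fast invocations plus one slow outlier.
const sample = [12, 15, 11, 14, 13, 480, 12, 16, 13, 12];
console.log(summarize(sample));
```

The outlier barely moves the median but dominates p99 and max, which is the pattern the next slides explain.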

Event Trigger
1. Invoke λ
2. Run
3. End: Result or Error or Timeout

Cold Start vs Warm Start
Warm: Event Trigger > Handler Code
Cold: Event Trigger > Create > Initialize > Handler Code
Function Invocation Time spans the steps after the trigger.
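The difference between the two paths can be observed from inside a function, at least partly. The sketch below assumes that module-level code runs once per container (the Initialize step) and that only the handler runs on warm invocations. The `timed` wrapper and its `report` hook are hypothetical names. The Create step happens before any user code runs, so it is not visible here.

```javascript
// Module scope runs once per container, during the Initialize step of a cold start.
const initializedAt = Date.now();
let invocations = 0;

// Wrap a callback-style handler and report per-invocation timing.
// `report` is a hypothetical hook; it defaults to logging a JSON line.
function timed(handler, report = (m) => console.log(JSON.stringify(m))) {
  return (event, cb) => {
    const startedAt = Date.now();
    const cold = invocations === 0; // first invocation in this container
    invocations += 1;
    handler(event, (err, result) => {
      report({
        cold,
        msSinceInit: startedAt - initializedAt, // small on a cold start
        handlerMs: Date.now() - startedAt,      // time spent in handler code
      });
      cb(err, result);
    });
  };
}

exports.handler = timed((event, cb) => cb(null, 'ok'));
```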

What's inside AWS Lambda? "It's containers" — Person waving their
hands

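λ: Running Commands for Discovery

    const exec = require('child_process').exec;

    exports.handler = (trigger, cb) => {
      // Run a shell command inside the function's environment and log its output.
      exec('whoami', (err, stdout) => {
        if (err) return cb(err);
        console.log(stdout);
        return cb(null);
      });
    };

Log output: [LOG TIME] sbxuser_1066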

λ is a UNIX system?! I know this!

Let's run SSH in λ
Diagram: λ (ssh process), SSH Tunnel
Firewall: no inbound ports

SSH in Lambda Architecture
λ: node.js wrapper > process.exec() > go sshd binary (x64), built on Go SSH Crypto Libs
https://github.com/smithclay/faassh

Max Session Length: 5 minutes (custom prompt optional) https://github.com/smithclay/faassh

Info from /proc
cat /proc/cpuinfo: 2x Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
cat /proc/meminfo: 3857664 kB
cat /proc/modules: ixgbevf (EC2 10Gbps Network Driver)
Conclusion: c4.large EC2 Compute-Optimized Instance (?)

c4.large instance, 10 Gbps
[Diagram: λs in theory on a VM. Legend: one marker for a Running 128 MB Function, another for a Not Running 128 MB Function.]
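A rough upper bound on how many 128 MB functions could fit on such a host follows from the /proc/meminfo figure. The calculation below assumes the 3857664 kB value is the host's total memory and ignores whatever the host OS and the Lambda runtime need for themselves, so the real number would be lower. The function name is made up for illustration.

```javascript
// Upper bound on functions per host, by memory alone.
function maxFunctionsPerHost(hostMemKb, functionMemMb) {
  return Math.floor(hostMemKb / (functionMemMb * 1024));
}

// 3857664 kB is the /proc/meminfo value from the slide; 128 MB is the function size.
console.log(maxFunctionsPerHost(3857664, 128)); // 29
```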

Frozen functions help avoid cold starts
cgroup freezer subsystem
https://www.kernel.org/doc/Documentation/cgroup-v1/freezer-subsystem.txt

Internals Recap
It's just containers on a VM
Functions frozen when not running
No magic unikernels involved

How do we measure and prevent cold starts? "Use Kubernetes"
— Troll

Cold Start Discovery

    // Module-level state survives between invocations of a warm container.
    var SO_SO_COLD = true;

    exports.handler = function(trigger, cb) {
      console.log('Cold? %s', SO_SO_COLD);
      SO_SO_COLD = false;
      return cb(null);
    };

https://github.com/smithclay/lambda-proc-info

Warming automation
Scheduled Event > λ (4 minute interval)
Only effective for non-concurrent execution!
http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html

Sending cold start events to analytics
λ console.log(coldStart) > Logs > CloudWatch Log Filter Trigger > λ > POST to Event DB (Insights)
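The second function in that pipeline receives log data from CloudWatch Logs as a base64-encoded, gzip-compressed JSON payload under `event.awslogs.data`. The sketch below decodes it and keeps only cold-start lines. It assumes the first function logs a JSON object such as `{"coldStart":true}` (the exact log format in the talk is not shown), and it leaves the POST to the event database out. The `eventType` field is a made-up name.

```javascript
const zlib = require('zlib');

// Decode the payload CloudWatch Logs delivers to a subscribed function.
function decodeLogEvents(event) {
  const compressed = Buffer.from(event.awslogs.data, 'base64');
  const payload = JSON.parse(zlib.gunzipSync(compressed).toString('utf8'));
  return payload.logEvents;
}

// Keep only log lines carrying the (assumed) cold-start JSON marker.
// Log lines may have a timestamp/request-id prefix, so parse from the first '{'.
function toColdStartEvents(logEvents) {
  const out = [];
  for (const { timestamp, message } of logEvents) {
    const start = message.indexOf('{');
    if (start === -1) continue;
    let parsed;
    try {
      parsed = JSON.parse(message.slice(start));
    } catch (e) {
      continue; // not a JSON log line
    }
    if (parsed && parsed.coldStart === true) {
      out.push({ eventType: 'ColdStart', timestamp });
    }
  }
  return out;
}
```

The resulting array is what would be sent to the analytics backend in a single POST.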

Cold Starts Visualized
[Chart; annotated intervals: ~7 hrs, ~8 hrs]
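Intervals like these can be computed from the timestamps of recorded cold-start events. A minimal sketch, with a made-up helper name and made-up sample timestamps:

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// Hours between consecutive cold starts, given event timestamps in ms.
function coldStartGapsHours(timestampsMs) {
  const sorted = [...timestampsMs].sort((a, b) => a - b);
  const gaps = [];
  for (let i = 1; i < sorted.length; i += 1) {
    gaps.push((sorted[i] - sorted[i - 1]) / MS_PER_HOUR);
  }
  return gaps;
}

// Hypothetical cold starts at hour 0, hour 7, and hour 15.
console.log(coldStartGapsHours([0, 7 * MS_PER_HOUR, 15 * MS_PER_HOUR]));
```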

λ Host Uptime
Cold starts happen when hosts change! (~8 hrs)

λ Host Subnet Hopping
10.13, 10.12, 10.11, 10.13, 10.12, 10.12, 10.12, 10.11
# of AZs in us-west-2: 3

What's the maximum concurrency of your function?
One: a scheduled event will warm it until the host retires.
> 1: more advanced strategy needed*

"Advanced" Strategy

    for i in `seq 1 $NUM_EXECUTIONS`; do
      echo "[$i] Executing $AWS_LAMBDA_FUNCTION_NAME..."
      aws lambda invoke ...
    done

https://gist.github.com/smithclay/e89dfe35fe2a4938db56bb12df76777c
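To warm more than one container, the invocations have to overlap, because a single warm container can serve requests that arrive one after another. The slide elides the invoke arguments, so the sketch below does not reproduce the gist. It shows the overlapping pattern with a hypothetical `invoke` function that the caller supplies, for example a wrapper around an SDK call.

```javascript
// Start numExecutions invocations without waiting for each to finish,
// then wait for all of them. `invoke` is a caller-supplied async function
// (hypothetical here) that performs one invocation and returns its result.
async function warmConcurrently(invoke, numExecutions) {
  const calls = [];
  for (let i = 1; i <= numExecutions; i += 1) {
    calls.push(invoke(i));
  }
  return Promise.all(calls);
}

// Usage with a stand-in invoke that just waits briefly.
const fakeInvoke = (i) =>
  new Promise((resolve) => setTimeout(() => resolve(`done ${i}`), 10));

warmConcurrently(fakeInvoke, 3).then((results) => console.log(results));
```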

Multiple containers running on a single host to serve parallel requests.
Tracking /proc/sys/kernel/random/boot_id and hostname.
Confirmed: cold start happens on container init.

So is this just a container PaaS?
Only if your PaaS has...
High-availability/multiple zones
Elastic fleet of compute-optimized VMs
A very good scheduling algorithm
Design (freezing, limits, etc.) for very fast invocation

FaaS in Production Reality
Dev λ, Prod λ
Orchestration (!?), Version/Deploy, Monitoring, Security, Cold Start Mgmt
The "learning cliff"
Great tweet from @mfdii

FaaS Isn't a Silver Bullet
"I've got a fast, computationally-intensive task that I need to perform occasionally in response to a well-defined event that isn't that sensitive to latency."
- The Ideal FaaS Developer
// TO DO: measure && share results

Thanks. @smithclay New Relic Gluecon 5/24/17

Clay Smith
