Performance Testing Serverless


@tisba First, some Questions…

@tisba Who does not know what Serverless means?

@tisba Who is doing Serverless in production?

@tisba Who is doing performance testing?

@tisba AWS User Group CGN
• Sebastian Cohnen (@tisba)
• 9+ years consulting & development
• focus on performance and architecture
• founder & CTO StormForger

StormForger.com: Start Perf Testing Now!
• Performance Testing SaaS for DevOps Teams
• Fully managed, focused on integration and continuous performance testing

Performance Testing AWS Serverless *SCNR*
Sebastian Cohnen, @tisba, StormForger.com

@tisba Why perf testing Serverless?

@tisba Serverless?*
* … a gross simplification
[Diagram: the layers Application State / Data, Runtime, OS, Networking, Hardware, … shown side by side for Bare Metal, VM, Containers, and Serverless]

@tisba Serverless
• …might have an impact on how you build your systems
• …systems will still be quite complex and distributed
• …might be used for (µ-)Services

@tisba Performance Testing Serverless!
• Learning about system behaviour
• Correct sizing & cost optimisations
• Stateless
• Issues with observability

@tisba Strategies
• Get started! Pragmatism over Perfection
• Divide and Conquer
• Two perspectives: “End-to-End” & “per Unit”
• Move downward in the stack

@tisba End-to-End:
• Scenarios modelled around actual usage
• Perspective is a perimeter view
• Can generate quite some noise when problems occur
• Usually answers business questions best
Unit:
• Test components or units in isolation
• Very useful for teams to debug & troubleshoot performance issues
• Good for checking technical SLOs
• Very hard to model in a way that is representative

@tisba Moving down the Stack

@tisba Moving down the Stack
[Diagram: Test Traffic, CloudFront, API GW, Lambda]

@tisba Moving down the Stack
[Diagram: Test Traffic, CloudFront, three API GW + Lambda pairs, DynamoDB]

@tisba Moving down the Stack
[Diagram: Test Traffic, CloudFront, three API GW + Lambda pairs, DynamoDB, ElastiCache]

@tisba Observability

@tisba Can you quickly figure out the reason for an increased latency or error rate for a specific endpoint?

@tisba Observability
• Logs
• Metrics
• Tracing
• Check out theburningmonk.com, e.g. https://theburningmonk.com/2018/04/serverless-observability-what-can-you-use-out-of-the-box/
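As a rough illustration of the "Logs" and "Metrics" bullets, the sketch below wraps a handler so that each invocation emits one structured log line with the endpoint name, the outcome, and the latency. The names `withTiming` and `"GET /hello"` are hypothetical and not from the talk; this is one possible starting point under those assumptions, not a recommended observability setup.

```javascript
// Wrap an async handler so every invocation emits one structured log line.
// `name`, `handler` and `log` are illustrative parameters (assumptions).
function withTiming(name, handler, log = console.log) {
  return async (...args) => {
    const start = process.hrtime.bigint();
    let outcome = "ok";
    try {
      return await handler(...args);
    } catch (err) {
      outcome = "error";
      throw err; // keep the original error visible to the caller
    } finally {
      // hrtime.bigint() is in nanoseconds; convert to milliseconds.
      const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
      log(JSON.stringify({ endpoint: name, outcome, durationMs }));
    }
  };
}

// Usage: wrap a (hypothetical) handler and call it once.
const handler = withTiming("GET /hello", async () => ({ statusCode: 200 }));
handler().then((res) => console.log(res.statusCode));
```

With one such line per request, latency and error rate per endpoint can be derived from the logs alone, which is the kind of question the previous slide asks.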

@tisba Performance Testing: Exercise to check for Observability

@tisba Common Pitfalls

@tisba HTTP Keep-Alive
• Using network connections for multiple requests
• otherwise extremely wasteful in terms of resources
• Keep-Alive significantly decreases latency and resource utilisation
[Figure label: connection: ~54%]

@tisba Who is using Node.js?

@tisba HTTP Keep-Alive
• Using network connections for multiple requests
• otherwise extremely wasteful in terms of resources
• Keep-Alive significantly decreases latency and resource utilisation
• e.g. Node.js does not keep alive HTTP client connections by default ⚠
[Figure label: connection: ~54%]
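A minimal sketch of opting in to Keep-Alive in Node.js, assuming the built-in `https` module is used directly. The helper `getStatus` and the `maxSockets` value of 50 are illustrative assumptions, not from the talk. Note that newer Node.js releases (19 and later, as far as I know) enable Keep-Alive on the default agent, so the warning above applies mainly to older runtimes.

```javascript
const https = require("https");

// One shared agent per process. Created outside any request handler so that
// open sockets can be reused by later requests.
const keepAliveAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 50, // assumed upper bound on parallel connections per host
});

// Hypothetical helper: GET a URL through the shared agent and resolve with
// the HTTP status code.
function getStatus(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, { agent: keepAliveAgent }, (res) => {
        res.resume(); // drain the body so the socket can return to the pool
        res.on("end", () => resolve(res.statusCode));
      })
      .on("error", reject);
  });
}
```

In a Lambda function the agent would typically live at module scope rather than inside the handler, so that warm invocations can reuse connections that are already open.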

@tisba HTTP Keep-Alive
• AWS Lambda functions are stateless
• State is externalised, often over HTTP
• Our upstream services typically use HTTP
• …and almost all AWS services are talked to over HTTP as well!

@tisba Remember Observability?
• Networking and OS-level operations are hard to observe!
• Instrumenting network operations is next to impossible (AFAIK)
• ephemeral port exhaustion, socket statistics, …
[Diagram: the Serverless stack (Application State / Data, Runtime, OS, Networking, …) with a bracket and a "?" next to it]

@tisba Let’s Test This!

@tisba
    definition.session("keep-alive", function(session) {
      session.times(25, function(context) {
        // HTTP 1.1 with HTTP Keep-Alive is the default
        context.get("http://testapp.loadtest.party/", { tag: "keep-alive" });
        context.waitExp(0.5);
      });
    });
    definition.session("no-keep-alive", function(session) {
      session.times(25, function(context) {
        context.get("https://testapp.loadtest.party/", {
          tag: "no-keep-alive",
          // instruct client to close connection after request is done
          headers: { Connection: "close", },
        });
        context.waitExp(0.5);
      });
    });

@tisba Only moments later…

@tisba bimodal distribution

@tisba HTTP Keep-Alive Impact*
time (ms)   without Keep-Alive   with Keep-Alive   change
median      122                  29                -76%
p99.9       242                  85                -65%
max         1,158                262               -77%
* Traffic generated from two instances to a single target, over 5 minutes, 25 requests per connection (keep-alive), ~200k requests in total, ~337 TCP&TLS handshakes/sec, avg test cluster CPU utilisation ~50%.
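The percentages on this slide can be re-derived from the latency values. The sketch below assumes that the larger value of each pair is the run without Keep-Alive and the smaller one the run with Keep-Alive; that pairing is read off the flattened chart and is an assumption. The names `results` and `reductionPercent` are illustrative.

```javascript
// Latencies in milliseconds, paired as described above (assumed pairing).
const results = {
  median: { noKeepAlive: 122, keepAlive: 29 },
  "p99.9": { noKeepAlive: 242, keepAlive: 85 },
  max: { noKeepAlive: 1158, keepAlive: 262 },
};

// Relative reduction in percent, rounded to the nearest integer.
function reductionPercent(before, after) {
  return Math.round((1 - after / before) * 100);
}

for (const [name, r] of Object.entries(results)) {
  console.log(`${name}: -${reductionPercent(r.noKeepAlive, r.keepAlive)}%`);
}
// prints:
// median: -76%
// p99.9: -65%
// max: -77%
```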

@tisba PS: Don’t forget about other connection pools! RDS, ElastiCache, …

@tisba Questions? Don’t be that guy! Start Perf Testing Now! StormForger.com

@tisba Bonus Level!

@tisba Private VPCs
• You can run AWS Lambda functions inside your private VPCs
• Keep in mind that they allocate IP addresses from the VPC they are running in
• If you are running out of address space (hello IPv4), weird things will happen!
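To get a feeling for how quickly address space can run out, the following sketch computes the usable IPv4 addresses of a subnet. It relies on AWS reserving five addresses in every subnet and allowing subnet sizes from /16 to /28. How many addresses Lambda actually consumes is not stated in the talk, so `inUse` is a hypothetical input and both function names are illustrative.

```javascript
// Usable IPv4 addresses in an AWS subnet with the given prefix length.
// AWS reserves 5 addresses per subnet; valid subnet sizes are /16 to /28.
function usableAddresses(prefixLength) {
  if (!Number.isInteger(prefixLength) || prefixLength < 16 || prefixLength > 28) {
    throw new RangeError("AWS subnets must be between /16 and /28");
  }
  return 2 ** (32 - prefixLength) - 5;
}

// Remaining headroom given a (hypothetical) number of addresses in use.
function remainingAddresses(prefixLength, inUse) {
  return usableAddresses(prefixLength) - inUse;
}

console.log(usableAddresses(24)); // prints 251
console.log(remainingAddresses(24, 200)); // prints 51
```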

@tisba Cascading Failures & Timeouts
• Know your timeouts for all layers
• Circuit breakers, exponential backoff
• Ideally: Deadline & Cancellation Propagation
https://landing.google.com/sre/sre-book/chapters/addressing-cascading-failures/
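The bullets above can be sketched as a retry loop with exponential backoff that also respects an overall deadline and hands the remaining time budget to the operation. This is a minimal illustration, not a production circuit breaker: the names `backoffDelayMs` and `retryWithDeadline`, the 100 ms base delay, the 5 s cap, and the use of random jitter are all assumptions that are not part of the talk.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt, baseMs = 100, capMs = 5000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry `operation` until it succeeds or the overall deadline has passed.
// `operation` receives the remaining time budget in milliseconds so it can
// pass a shorter timeout to the next layer (a simple form of deadline
// propagation).
async function retryWithDeadline(operation, deadlineMs, now = Date.now) {
  const deadline = now() + deadlineMs;
  let lastError;
  for (let attempt = 0; ; attempt++) {
    const remaining = deadline - now();
    if (remaining <= 0) break;
    try {
      return await operation(remaining);
    } catch (err) {
      lastError = err;
    }
    // Jitter: wait a random time up to the exponential delay, but never
    // sleep past the deadline.
    const delay = Math.random() * backoffDelayMs(attempt);
    const wait = Math.min(delay, deadline - now());
    if (wait <= 0) break;
    await new Promise((resolve) => setTimeout(resolve, wait));
  }
  throw lastError ?? new Error("deadline exceeded before first attempt");
}
```

A caller might use `retryWithDeadline((budgetMs) => callUpstream({ timeout: budgetMs }), 2000)`, where `callUpstream` stands in for whatever client function talks to the next layer.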
