Lessons Learned from Benchmarking Serverless Workloads

Slide 1

Slide 1 text

Lessons Learned from Benchmarking Serverless Workloads
Diana Arroyo & Alek Slominski
IBM Research

Slide 2

Slide 2 text

Serverless: Quo Vadis?
•  We hope our serverless workload generator is a useful tool for anybody looking at serverless services who wants to compare them in more depth.
•  We plan to write a blog post about our experience with serverless workload generation and benchmarking, and to open source the benchmark.
If you have questions or feedback, or want to tell me where I am wrong:
Aleksander Slominski, @aslom (tweet me!), aslom email at us.ibm.com

Slide 3

Slide 3 text

Overview
•  Serverless workload characteristics
•  Generating serverless workloads
•  Results
•  Lessons learned

Slide 4

Slide 4 text

Serverless Workload Characteristics
•  Serverless workloads can require thousands of concurrent short-lived containers to be created and destroyed in milliseconds.
•  Container aka Action aka Function aka …
   –  The name depends on the serverless service, framework, ...
•  Required operations:
   –  Start a lot of actions
   –  Generate work: start, do work, stop, and repeat
   –  Actions run for some time to allow for reuse (cold vs. hot) and must be fast!

Slide 5

Slide 5 text

Serverless Workload Benchmark Goals
•  Simulate the lifecycle of a serverless action as it takes part in a serverless workload
•  Minimal scenario:
   –  Test serverless action start time
   –  Send 1 .. N requests and validate each response
   –  Pause / resume the action as needed
   –  Stop (kill) actions
•  A scenario sets how many actions are started, when, for how long, etc.
•  A workload runs multiple scenarios (in sequence, in parallel, etc.)
•  Gather statistics about workload execution
   –  Enough to learn how well test environments handle such high-load scenarios
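The scenario and workload settings listed above could be captured in a small data structure. The following is a minimal sketch, not the benchmark's actual code: the class names `Scenario` and `Workload` and all field names are hypothetical, chosen only to mirror the bullets (how many actions, how many requests, when to start).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scenario:
    """One scenario: how many actions to start and what each is asked to do.

    All field names are hypothetical; they are not taken from the benchmark.
    """
    actions: int                # number of action containers to start
    requests: int               # requests sent to each action (1 .. N)
    work_ms: int                # simulated work per request, in milliseconds
    start_delay_s: float = 0.0  # when to start, relative to workload start


@dataclass
class Workload:
    """A workload runs several scenarios, in sequence or in parallel."""
    scenarios: List[Scenario] = field(default_factory=list)
    parallel: bool = True

    def total_actions(self) -> int:
        return sum(s.actions for s in self.scenarios)

    def total_requests(self) -> int:
        return sum(s.actions * s.requests for s in self.scenarios)


# Three identical scenarios, each starting 10 actions that serve 5 requests.
workload = Workload([Scenario(actions=10, requests=5, work_ms=100)] * 3)
```

Keeping the description separate from the code that starts containers would let the same workload be replayed against different runtimes.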

Slide 6

Slide 6 text

Simple Scenario: WebSocket Action
[Diagram: the test driver (overall workload) is linked to Scenario Instance 1, Scenario Instance 2, …, Scenario Instance S, and each scenario instance is linked to its actions; the links are WebSocket connections.]

Slide 7

Slide 7 text

Simple Scenario Setup
Setup: Docker
•  Start the test driver container; when it starts running, it opens listening sockets and starts S scenario containers
   –  docker run -e setup_for_scenario_containers driver
•  Each scenario container connects to the test driver using a websocket and starts A action containers
   –  docker run -e setup_for_action_containers -e WS_CALLBACK=ws://test_driver:port scenario
•  Each action container, when started, connects back to its scenario container using a websocket to ask for requests
   –  docker run -e WS_CALLBACK=ws://scenario:port hello-action
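The setup commands above are schematic. As a sketch of how a scenario container might assemble them, the helper below builds `docker run` argument lists with the options placed before the image name, which is where the Docker CLI expects them. The function names are hypothetical, the use of `-d` (detached mode) is an assumption, and only `WS_CALLBACK` and the image name `hello-action` come from the slide. Nothing is executed here; the lists could be passed to `subprocess.run`.

```python
from typing import Dict, List


def docker_run_argv(image: str, env: Dict[str, str]) -> List[str]:
    """Build a `docker run` argv: options first, image name last."""
    argv = ["docker", "run", "-d"]  # -d (detached) is an assumption
    for key, value in sorted(env.items()):
        argv += ["-e", f"{key}={value}"]
    argv.append(image)
    return argv


def action_commands(scenario_host: str, port: int, count: int) -> List[List[str]]:
    """One command per action container, all calling back to the same scenario."""
    callback = f"ws://{scenario_host}:{port}"
    return [
        docker_run_argv("hello-action", {"WS_CALLBACK": callback})
        for _ in range(count)
    ]


commands = action_commands("scenario", 8080, count=3)
```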

Slide 8

Slide 8 text

Simple Scenario Execution
Execution:
•  After starting S scenario containers, the test driver container waits on a websocket for results from the scenario containers
•  After starting A action containers, each scenario container waits on a websocket for each action container to connect, then starts sending N requests and waits for the responses
•  After starting, each action container sends “ready” over the websocket, then waits for requests, processes each request (sleeps for M milliseconds), and sends the response back
End result:
•  1 + S + S * A containers running (driver container + scenario containers + action containers)
•  N * S * A requests processed
•  Duration: the ideal time (zero startup time) is N * M milliseconds
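The ready / request / response exchange and the end-result formulas can be checked with a small in-process simulation. This is a sketch under stated assumptions, not the benchmark itself: `asyncio` queues stand in for websockets, coroutines stand in for containers, each action is assumed to receive its N requests one at a time, and all function names are hypothetical.

```python
import asyncio
import time


async def action(requests: asyncio.Queue, responses: asyncio.Queue, work_ms: int) -> None:
    """Action: announce readiness, then serve requests until told to stop."""
    await responses.put("ready")
    while True:
        request = await requests.get()
        if request is None:                  # stop signal from the scenario
            return
        await asyncio.sleep(work_ms / 1000)  # simulated work of M milliseconds
        await responses.put(("done", request))


async def drive_action(n: int, work_ms: int) -> int:
    """Scenario side for one action: wait for 'ready', then send N requests in turn."""
    requests, responses = asyncio.Queue(), asyncio.Queue()
    task = asyncio.create_task(action(requests, responses, work_ms))
    if await responses.get() != "ready":
        raise RuntimeError("action did not report ready")
    handled = 0
    for i in range(n):
        await requests.put(i)
        if await responses.get() != ("done", i):  # validate the response
            raise RuntimeError(f"unexpected response to request {i}")
        handled += 1
    await requests.put(None)
    await task
    return handled


async def scenario(a: int, n: int, work_ms: int) -> int:
    """Scenario: run A actions concurrently and total their handled requests."""
    counts = await asyncio.gather(*(drive_action(n, work_ms) for _ in range(a)))
    return sum(counts)


async def driver(s: int, a: int, n: int, work_ms: int) -> dict:
    """Test driver: run S scenarios concurrently and collect the results."""
    started = time.perf_counter()
    totals = await asyncio.gather(*(scenario(a, n, work_ms) for _ in range(s)))
    return {
        "containers": 1 + s + s * a,  # driver + scenarios + actions
        "requests": sum(totals),      # should equal N * S * A
        "ideal_ms": n * work_ms,      # zero startup time
        "elapsed_ms": (time.perf_counter() - started) * 1000,
    }


result = asyncio.run(driver(s=3, a=4, n=5, work_ms=10))
```

With S = 3, A = 4, N = 5, and M = 10, the formulas give 16 containers, 60 requests, and an ideal duration of 50 ms; the gap between `elapsed_ms` and `ideal_ms` is the overhead a real run would attribute to startup and scheduling.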

Slide 9

Slide 9 text

Preliminary results
•  The benchmark is not finished and the results are not final
•  Lots of tuning opportunities …
•  We would like to make the benchmark source code available so anybody can run it and modify the workload to suit their needs
   –  Let us know!
Lies, damned lies, and statistics, and benchmarks!

Slide 10

Slide 10 text

Can you run it? Lessons Learned
•  The lessons learned when running these workloads
•  How well does the target system handle the workload?
•  Start simple: what if we run a serverless workload using one of these runtimes:
   –  Docker Engine
   –  Swarm
   –  Kubernetes

Slide 11

Slide 11 text

More complex environments
•  Mesos
Cloud services:
•  AWS Lambda
•  Azure Functions
•  Google Cloud Functions
•  IBM OpenWhisk
•  Other?

Slide 12

Slide 12 text

Lessons learned
•  Scaling becomes harder as size increases
   –  We can easily run hundreds of containers but run into issues when running 1000
•  Limitations in Docker engine
   –  It seems we hit some limits on how many processes can be started per second
   –  The limits differ between versions of Docker
•  Locking in Swarm
   –  Experienced with Mesos (different timing of some operations leads to deadlocks …)
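One way to see whether a run is hitting a start-rate ceiling is to bucket the recorded container start times by second and look at the peak. The helper below is a sketch with hypothetical names; it assumes start times are recorded as non-negative seconds since the workload began, and the sample timestamps are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def starts_per_second(start_times: Iterable[float]) -> Dict[int, int]:
    """Count container starts in each whole-second bucket."""
    return dict(Counter(int(t) for t in start_times))


def peak_start_rate(start_times: Iterable[float]) -> int:
    """Highest number of starts observed in any one-second bucket."""
    buckets = starts_per_second(start_times)
    return max(buckets.values(), default=0)


# Illustrative timestamps only, not measured data.
sample = [0.1, 0.5, 0.9, 1.2, 2.0, 2.1]
peak = peak_start_rate(sample)
```

If the peak stays flat while the number of requested containers grows, the runtime rather than the workload generator is likely the limiting factor.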

Slide 13

Slide 13 text

OpenWhisk
•  Overview: https://developer.ibm.com/openwhisk/
•  Open Source: https://github.com/openwhisk
•  Slack channel: https://github.com/openwhisk/openwhisk/wiki

Slide 14

Slide 14 text

API HARMONY: find, learn about, and use web APIs
http://apiharmony-open.mybluemix.net
Blog: http://www.apiful.io/
[Screenshot: frequently used Instagram endpoints; top examples; other endpoints that developers calling this endpoint also call; frequently used response fields; frequently used parameters and common values.]

Slide 15

Slide 15 text

More Results for MesosCon EU
•  We will present our experience running workload experiments in Mesos, and share Mesos tuning tips and workload generation code to make Mesos an ideal platform for serverless workloads:
•  http://sched.co/7opW
Wednesday, August 31, 2:00pm - 2:50pm, Amsterdam