
Predictability of Performance in Public Clouds - Some Empirical Data and Lessons Learned for Software Performance Testing

Invited Presentation at the LTB Workshop 2017 @ ICPE

xLeitix

April 23, 2017


Transcript

  1. software evolution & architecture lab, University of Zurich, Switzerland. Predictability of Performance in Public Clouds: Some Empirical Data and Lessons Learned for Software Performance Testing. Dr. Philipp Leitner, @xLeitix
  2. What’s the problem? [Figure: IO Bandwidth [Mb/s] over Measurement Runtime [h] for Instance 9097 and Instance 14704]
  3. What’s the problem? [Figure: same plot as above] Two identical instances, very different performance.
  4. What’s the problem? [Figure: same plot as above] Same instance over time, also very different performance.
  5. Two Kinds of Predictability
     • Inter-Instance Predictability: “How similar is the performance of multiple identical instances?”
     • Intra-Instance Predictability: “How self-similar is the performance of a single instance over time?”
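
One concrete way to read these two notions is to compute a dispersion measure such as the coefficient of variation (CoV) at two levels: across the mean performance of several identical instances (inter-instance), and within the measurement series of one instance (intra-instance). A minimal Python sketch, where the `samples` values and instance IDs are hypothetical stand-ins for traces like the IO bandwidth plot above, not data from the talk:

```python
import statistics

def cov(xs):
    """Coefficient of variation: stdev relative to the mean."""
    m = statistics.mean(xs)
    return statistics.stdev(xs) / m if m else float("nan")

# Hypothetical IO bandwidth samples [Mb/s], keyed by instance ID
samples = {
    "instance-9097":  [48.2, 47.9, 21.3, 22.0, 47.5],
    "instance-14704": [20.1, 19.8, 20.4, 20.2, 19.9],
}

# Intra-instance predictability: dispersion of one instance over time
for iid, xs in samples.items():
    print(f"{iid}: intra-instance CoV = {cov(xs):.2f}")

# Inter-instance predictability: dispersion across the instance means
means = [statistics.mean(xs) for xs in samples.values()]
print(f"inter-instance CoV of means = {cov(means):.2f}")
```
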
  6. Data Collection Approach (Tooling): J. Scheuner, P. Leitner, J. Cito, and H. C. Gall: Cloud WorkBench - Infrastructure-as-Code Based Cloud Benchmarking. 2014 IEEE 6th International Conference on Cloud Computing Technology and Science, Singapore, 2014, pp. 246-253. doi: 10.1109/CloudCom.2014.98. Code: https://github.com/sealuzh/cloud-workbench Demo: https://www.youtube.com/watch?v=0yGFGvHvobk Collected ~54,000 data points over 2 months.
  7. Hardware Heterogeneity: CPU models we observed in our tests (for m1.small and Azure Small in North America).
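
Because nominally identical instance types can be backed by different CPU models, it helps to record the processor of every instance alongside each measurement so that later variance can be explained. A minimal sketch that reads the model name from /proc/cpuinfo on Linux; the helper is mine, not part of the talk's tooling:

```python
def cpu_model(path="/proc/cpuinfo"):
    """Return the first 'model name' entry on Linux, or None."""
    with open(path) as f:
        for line in f:
            if line.startswith("model name"):
                return line.split(":", 1)[1].strip()
    return None

# Log the CPU model next to every benchmark result, e.g.
# "Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz"
print(cpu_model())
```
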
  8. What does this mean for performance testing? Examples from microbenchmarking io.protostuff on baremetal versus a single cloud instance on GCE. [Figures: ops/sec for generated_serialize_1_int_field and runtime_serialize_1_int_field, baremetal versus cloud]
  9. A/A Testing. Basic idea: compare two identical configurations and (hopefully) observe no diff. [Table: empty 5x5 matrix of pairwise p-values between runs 1-5 for runtime_serialize_1_int_field]
  10. A/A Testing. Basic idea: compare two identical configurations and (hopefully) observe no diff. Pairwise p-values between five runs of runtime_serialize_1_int_field:

      Runs  1         2         3         4         5
      1     X         0.001041  0.000472  0.04298   0.2211
      2     0.001041  X         0.862     0.4291    0.003211
      3     0.000472  0.862     X         0.4135    0.007995
      4     0.04298   0.4291    0.4135    X         0.04909
      5     0.2211    0.003211  0.007995  0.04909   X
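
The matrix makes the punchline visible: these are pairwise tests between runs of the *same* configuration, yet several p-values fall below the usual 0.05 threshold, so a naive reading would declare identical setups "different". A minimal sketch of producing such a matrix, assuming five hypothetical lists of ops/sec samples and using a two-sided Mann-Whitney U test (the slides do not name the exact test, so treat that choice as illustrative):

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

# Hypothetical ops/sec samples from five runs of the *same* configuration
runs = [
    [41.2, 40.8, 41.5, 40.9, 41.1],
    [39.7, 40.1, 39.9, 40.3, 39.8],
    [40.0, 39.8, 40.2, 40.1, 39.9],
    [40.5, 40.9, 40.2, 40.7, 40.4],
    [41.0, 40.6, 41.3, 40.8, 41.2],
]

# Pairwise A/A tests: ideally, no pair should differ significantly
for (i, a), (j, b) in combinations(enumerate(runs, start=1), 2):
    p = mannwhitneyu(a, b, alternative="two-sided").pvalue
    flag = "  <- 'significant' despite identical configs" if p < 0.05 else ""
    print(f"run {i} vs run {j}: p = {p:.4f}{flag}")
```

If A/A comparisons already produce "significant" differences of a given magnitude, then an A/B difference of similar magnitude on the same setup cannot be trusted.
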
  11. Mitigation Strategies
      (Level 0) Use baremetal / dedicated hardware if feasible
      (Level 1) A/A testing is key
      (Level 2) Test different providers
      (Level 3) Scale up
  12. Mitigation Strategies
      (Level 0) Use baremetal / dedicated hardware if feasible
      (Level 1) A/A testing is key
      (Level 2) Test different providers
      (Level 3) Scale up
      (Level 4) Experiment interleaving
  13. Experiment Interleaving: alternate the versions under test on the same instance (V1 V2 V1 V2 V1 …). Listen to this talk @ ICPE for more info and experiments :)
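
The point of interleaving is that slow drift in instance performance affects both versions roughly equally, instead of biasing whichever version happens to run later. A minimal scheduling sketch with hypothetical benchmark callables; it illustrates the V1 V2 V1 V2 pattern, not the harness actually used in the experiments:

```python
import time

def run_trial(bench):
    """Time one trial of a benchmark callable (wall-clock, illustrative)."""
    start = time.perf_counter()
    bench()
    return time.perf_counter() - start

def interleave(bench_v1, bench_v2, trials=10):
    """Alternate V1 and V2 trials on the same instance (V1 V2 V1 V2 ...),
    so drift over time hits both versions about equally."""
    v1_times, v2_times = [], []
    for _ in range(trials):
        v1_times.append(run_trial(bench_v1))
        v2_times.append(run_trial(bench_v2))
    return v1_times, v2_times

# Hypothetical workloads standing in for two versions under test
v1, v2 = interleave(lambda: sum(range(100_000)), lambda: sum(range(110_000)))
print(f"V1 mean: {sum(v1)/len(v1):.6f}s, V2 mean: {sum(v2)/len(v2):.6f}s")
```
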
  14. Mitigation Strategies
      (Level 0) Use baremetal / dedicated hardware if feasible
      (Level 1) A/A testing is key
      (Level 2) Test different providers
      (Level 3) Scale up
      (Level 4) Experiment interleaving
      (Level 5) Admit defeat (fine-grained diffs might just not be discoverable for you)
  15. Summary (software evolution & architecture lab): Inter-Instance Predictability and Intra-Instance Predictability. [Figure: IO Bandwidth [Mb/s] over Measurement Runtime [h] for Instance 9097 and Instance 14704] Philipp Leitner @xLeitix. Main contacts: Jürgen Cito @citostyle, Christoph Laaber @chrstphlbr, Joel Scheuner @joe4dev
  16. Summary (software evolution & architecture lab). Dr. Philipp Leitner, University of Zurich, Switzerland. Some insights from benchmarking EC2, GCE, Azure, and Softlayer. More info: Philipp Leitner and Jürgen Cito. 2016. Patterns in the Chaos - A Study of Performance Variation and Predictability in Public IaaS Clouds. ACM Trans. Internet Technol. 16, 3, Article 15 (April 2016), 23 pages. DOI: http://dx.doi.org/10.1145/2885497. Philipp Leitner @xLeitix. Main contacts: Jürgen Cito @citostyle, Christoph Laaber @chrstphlbr, Joel Scheuner @joe4dev