WTF is Modeling, Anyway!?

A video conversation with performance and capacity management veteran Boris Zibitsker about how I saved a multi-million-dollar computing platform using a one-line performance model (at 21:50 minutes). "Best practices" caused the problem.

Dr. Neil Gunther

August 02, 2017

Transcript

  1. WTF is “Modeling”, Anyway!?

    Video conversation with Boris Zibitsker on the BEZNext Channel. Dr. Neil J. Gunther (@DrQz), Performance Dynamics, August 2, 2017. © 2018 Performance Dynamics.
  2. Types of Models

    The word “model” is overloaded in both English and technology: UML software modeling; model train set; Kim Kardashian; financial/accounting models; Amdahl’s law; statistical regression; numerical mesh simulation; benchmark workload simulation; support vector machines; convolutional neural nets.

    We need to specify clearly and unambiguously which model we mean.
  3. What is a Performance Model?

    A performance model is a mathematical framework used to assess the validity of performance data (an overlooked necessity):

    data + model = information

    1. Select performance metrics as inputs: λ, R, S, Q, ...
    2. The model is a relationship between those metrics: Q = λR
    3. Model outputs are calculated metrics
    4. Compare calculated metrics with (other) measured metrics
    5. Repeat until satisfied

    The model can then project metric values into circumstances that are not measured or not measurable.
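The five-step loop on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the usual queueing-theory reading of the symbols (λ as arrival rate, R as residence time, Q as mean number of requests in the system), which the slide does not spell out. The measurement values are hypothetical and do not come from the talk.

```python
def calculated_queue_length(arrival_rate, residence_time):
    """Model from the slide: Q = lambda * R."""
    return arrival_rate * residence_time


def relative_error(measured, calculated):
    """Signed relative error of the calculated value against the measurement."""
    return (measured - calculated) / measured


# Hypothetical measurements, invented for illustration only.
arrival_rate = 50.0     # lambda: requests per second (input metric)
residence_time = 0.2    # R: seconds per request (input metric)
measured_queue = 10.4   # Q: independently measured mean queue length

# Steps 2-3: apply the model to get a calculated metric.
model_queue = calculated_queue_length(arrival_rate, residence_time)

# Step 4: compare the calculated metric with the other measured metric.
error = relative_error(measured_queue, model_queue)
print(f"calculated Q = {model_queue:.2f}, measured Q = {measured_queue:.2f}, "
      f"relative error = {error:.1%}")
```

A large discrepancy at step 4 would suggest that the data, the model, or both need another look before any projection is attempted.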
  4. A Real Simple Real Model
  5. The Environment

    [Diagram: production environment with an S390 and robotic tape silos, two 50-node IBM AIX/SP-2 clusters, FDDI rings, and user Tek X-terminals.]
  6. The Data

    Problem: Home-grown application could take > 60 seconds to launch.

    IBM cluster would cost $millions to replace.
  7. Data Sample

    Table: Mean launch time in seconds

    Server  Files  Time (s)
    Xfs1     8371  18.57
    Xfs2     7113  16.72
    NFS1     4781  17.01
    NFS2      109   9.41

    Observation:
    109 files is nearly 128 = 2^7, and log2(128) = 7 is close to 9 seconds
    4781 is near 4096 = 2^12, and log2(4096) = 12 is close to 17 seconds
    8371 is near 8192 = 2^13, and log2(8192) = 13 is close to 18 seconds
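The observation on this slide can be reproduced mechanically. The sketch below takes the four rows of the data sample and rounds log2 of each file count to the nearest integer, so the exponent can be compared with the measured launch time. The helper name `nearest_power_of_two_exponent` is made up for this illustration.

```python
import math

# (server, number of files, mean launch time in seconds), from the data sample table
SAMPLES = [
    ("Xfs1", 8371, 18.57),
    ("Xfs2", 7113, 16.72),
    ("NFS1", 4781, 17.01),
    ("NFS2", 109, 9.41),
]


def nearest_power_of_two_exponent(n_files):
    """Integer exponent e such that 2**e is the power of two closest to n_files on a log scale."""
    return round(math.log2(n_files))


exponents = [nearest_power_of_two_exponent(files) for _, files, _ in SAMPLES]

for (server, files, seconds), exponent in zip(SAMPLES, exponents):
    print(f"{server}: {files} files ~ 2^{exponent} = {2 ** exponent}, "
          f"launch time {seconds} s")
```

The exponents track the launch times only roughly, which is enough to suggest a logarithmic relationship worth testing with a proper fit.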
  8. Visual Confirmation

    [Figure: log-linear plot of mean launch time (seconds) against log(number of files), showing the data and an LSQ fit. © 2017 Performance Dynamics]
  9. Visual Confirmation

    [Figure: “Log Model” plot of mean launch time (seconds) against number of files on linear axes, showing the data and the model. © 2017 Performance Dynamics]
  10. One-Line Performance Model

    Mean launch time R:

    R = k log10(N)

    where N is the number of remote-server files and k = 4.57 is the proportionality constant for base-10 logarithms.

    Table: Log model of mean R times

    Remote server  Measured (s)  Log model (s)  % Error
    Xfs1           18.57         17.929948       3.446698
    Xfs2           16.72         17.606686      -5.303144
    NFS1           17.01         16.818079       1.128281
    NFS2            9.41          9.312522       1.035894

    Model is accurate to within about 5% (largest error: 5.3%).

    But where does the logarithmic behavior come from?
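The one-line model is easy to check against the four measurements. The slides do not say how k was obtained, so the sketch below assumes an ordinary least-squares fit of R = k log10(N) with no intercept. Under that assumption the fitted constant comes out very close to the quoted 4.57. The function names `fit_k` and `launch_time` are invented for this example.

```python
import math

# (server, number of files N, measured mean launch time R in seconds), from the slides
DATA = [
    ("Xfs1", 8371, 18.57),
    ("Xfs2", 7113, 16.72),
    ("NFS1", 4781, 17.01),
    ("NFS2", 109, 9.41),
]


def fit_k(data):
    """Least-squares slope for R = k * log10(N) through the origin (assumed fitting method)."""
    sum_xy = sum(math.log10(n) * r for _, n, r in data)
    sum_xx = sum(math.log10(n) ** 2 for _, n, _ in data)
    return sum_xy / sum_xx


def launch_time(n_files, k=4.57):
    """One-line model from the slide: R = k * log10(N)."""
    return k * math.log10(n_files)


k_fit = fit_k(DATA)
print(f"fitted k = {k_fit:.4f}")

for server, n_files, measured in DATA:
    model = launch_time(n_files, k_fit)
    pct_error = 100.0 * (measured - model) / measured
    print(f"{server}: measured {measured:.2f} s, model {model:.2f} s, "
          f"error {pct_error:+.2f}%")
```

Using the rounded k = 4.57 instead of the fitted value shifts the model times by only a few thousandths of a second, so small differences from the slide's table in the last digits are expected.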
  11. To Get a Log, You Need a Tree

    [Diagram: a tree with levels 0, 1, 2 holding 1, 10, 100 nodes. The level is the logarithm of the number.]
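The relationship in the diagram can be illustrated with a small counting loop: if each level of a tree holds ten times as many entries as the one above it, the level needed to reach N entries grows like log10(N). The fan-out of 10 is chosen only to match the 1, 10, 100 in the figure. The slide does not state the actual structure on the servers, and `tree_level` is a hypothetical helper.

```python
def tree_level(n_items, fanout=10):
    """Smallest level whose capacity fanout**level is at least n_items.

    Uses integer arithmetic only, so there is no floating-point rounding at exact powers.
    """
    level, capacity = 0, 1
    while capacity < n_items:
        capacity *= fanout
        level += 1
    return level


for n in (1, 10, 100, 1000):
    print(f"{n} items -> level {tree_level(n)}")
```

Each tenfold increase in the item count adds just one level, which is the behavior a logarithmic launch-time model would reflect.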
  12. Modeling Results

    1. Saved $multi-million IBM SP2 cluster
    2. SP2 replacement would NOT have solved anything
    3. Problem caused by “best practices” for system management
    4. Performance management was completely overlooked
    5. Font server held ∼15000 files but only ∼1000 needed
    6. Simple log performance model told the whole story
    7. Simple fix with no CapEx cost: prune the tree!
    8. 300% performance win in shortened launch times!
    9. Log model more about explanation than prediction/forecasting
  13. Contact Information

    Performance Dynamics Company, Castro Valley, California
    www.perfdynamics.com
    perfdynamics.blogspot.com
    facebook.com/PerformanceDynamics
    twitter.com/DrQz
    [email protected]
    +1-510-537-5758