
WTF is Modeling, Anyway!?


A video conversation with performance and capacity management veteran Boris Zibitsker about how I saved a multimillion-dollar computing platform using a one-line performance model (at 21:50 in the video). "Best practices" caused the problem.

Dr. Neil Gunther

August 02, 2017

Transcript

  1. WTF is “Modeling”, Anyway!?
    Video conversation with Boris Zibitsker on the BEZNext Channel
    Dr. Neil J. Gunther — @DrQz
    Performance Dynamics
    August 2, 2017
    © 2018 Performance Dynamics WTF is “Modeling”, Anyway!? August 2, 2017


  3. Types of Models
    The word “model” is overloaded in both English and technology:
    UML software modeling
    model train set
    Kim Kardashian
    financial/accounting models
    Amdahl’s law
    statistical regression
    numerical mesh simulation
    benchmark workload simulation
    support vector machines
    convolutional neural nets
    We need to specify clearly and unambiguously which kind of model we mean

  4. What is a Performance Model?
    A performance model is a mathematical framework used to assess the
    validity of performance data (an overlooked necessity)
    data + model = information
    1 Select performance metrics as inputs: λ, R, S, Q, . . .
    2 Model is a relationship between those metrics: Q = λ R
    3 Model outputs are calculated metrics
    4 Compare calculated metrics with (other) measured metrics
    5 Repeat until satisfied
    Can then project metric values into circumstances that are not
    measured or not measurable
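The five steps above can be sketched with the relationship Q = λ R from step 2. This is only an illustration: the input numbers and the helper names `queue_length` and `percent_error` are hypothetical and do not come from the talk.

```python
def queue_length(arrival_rate, residence_time):
    """Model from step 2: Q = lambda * R."""
    return arrival_rate * residence_time


def percent_error(measured, calculated):
    """Signed difference between a measured and a calculated metric, in percent."""
    return 100.0 * (measured - calculated) / measured


# Step 1: measured inputs (hypothetical values, for illustration only).
arrival_rate = 50.0     # lambda, requests per second
residence_time = 0.2    # R, seconds per request
measured_queue = 10.4   # Q, measured independently of the two inputs

# Step 3: the model output is a calculated metric.
calculated_queue = queue_length(arrival_rate, residence_time)

# Step 4: compare the calculated metric with the other measured metric.
error = percent_error(measured_queue, calculated_queue)
print(f"calculated Q = {calculated_queue:.2f}, measured Q = {measured_queue:.2f}")
print(f"discrepancy = {error:.1f}%")
```

A large discrepancy at step 4 suggests that either the data or the model needs another look, which is the "repeat until satisfied" loop in step 5.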

  5. A Real Simple Real Model

  6. The Environment
    [Diagram: production environment comprising an S390, robotic tape silos, two IBM AIX/SP-2 clusters (50 nodes each), FDDI rings, and user Tek X-terminals]

  7. The Data
    Problem:
    Home-grown application could take > 60 seconds to launch
    IBM cluster would cost $millions to replace

  8. Data Sample
    Table: Mean launch time in seconds
    Server   Files   Time (s)
    Xfs1     8371    18.57
    Xfs2     7113    16.72
    NFS1     4781    17.01
    NFS2      109     9.41
    Observation:
    109 files is near 128 = 2^7, and log2(128) = 7 is close to 9 seconds
    4781 is near 4096 = 2^12, and log2(4096) = 12 is close to 17 seconds
    8371 is near 8192 = 2^13, and log2(8192) = 13 is close to 18 seconds
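The observation can be checked without rounding to the nearest power of two. The sketch below takes the four rows of the data sample and compares each launch time with log2 of the file count. The variable names are illustrative, not from the talk.

```python
import math

# (server, number of files, mean launch time in seconds) from the data sample.
samples = [
    ("Xfs1", 8371, 18.57),
    ("Xfs2", 7113, 16.72),
    ("NFS1", 4781, 17.01),
    ("NFS2", 109, 9.41),
]

# If launch time grows like log2(files), seconds per unit of log2(files)
# should be roughly the same for every server.
ratios = []
for server, files, seconds in samples:
    log_files = math.log2(files)
    ratio = seconds / log_files
    ratios.append(ratio)
    print(f"{server}: log2({files}) = {log_files:.2f}, "
          f"time = {seconds:.2f} s, ratio = {ratio:.2f}")
```

The ratio staying in a narrow band while the file count varies by almost two orders of magnitude is what hints at logarithmic behavior.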

  9. Visual Confirmation
    [Figure: log-linear plot of mean launch time (seconds) against log(number of files), showing the data points and the LSQ fit]
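The slide does not say how the LSQ fit was computed. One possibility, shown here as a sketch, is a least-squares line through the origin in the variable log10(N), which has the closed form k = Σxy / Σx². The function name `fit_through_origin` is hypothetical.

```python
import math

# (number of files, mean launch time in seconds) from the data sample.
data = [(8371, 18.57), (7113, 16.72), (4781, 17.01), (109, 9.41)]


def fit_through_origin(points):
    """Least-squares slope k for R = k * log10(N), with no intercept term.

    This is an assumed form of the fit, not one confirmed by the slides.
    """
    xs = [math.log10(n) for n, _ in points]
    ys = [seconds for _, seconds in points]
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / sum_xx


k = fit_through_origin(data)
print(f"fitted k = {k:.4f}")
```

Under that assumption, the fitted slope lands close to the k = 4.57 quoted for the one-line model.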

  10. Visual Confirmation
    [Figure: "Log Model" plot of mean launch time (seconds) against number of files on a linear axis, showing the data points and the model curve]

  11. One-Line Performance Model
    Mean launch time R:
    R = k log10(N)
    where N is the number of remote-server files and k = 4.57 is the
    proportionality constant for base-10 logarithms
    Table: Log model of mean R times
    Remote server   Measured seconds   Log model    % Error
    Xfs1            18.57              17.929948     3.446698
    Xfs2            16.72              17.606686    -5.303144
    NFS1            17.01              16.818079     1.128281
    NFS2             9.41               9.312522     1.035894
    Model is accurate to within about 5% (largest error: -5.3%, for Xfs2)
    But where does logarithmic behavior come from?
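The table can be regenerated from the one-line model. This sketch uses the rounded constant k = 4.57 and takes the error as (measured - model) / measured, which matches the signs in the table. Because k is rounded, the last digits differ slightly from the slide values.

```python
import math

K = 4.57  # proportionality constant quoted on the slide (rounded)

# (remote server, number of files, measured mean launch time in seconds)
measurements = [
    ("Xfs1", 8371, 18.57),
    ("Xfs2", 7113, 16.72),
    ("NFS1", 4781, 17.01),
    ("NFS2", 109, 9.41),
]


def launch_time(n_files, k=K):
    """One-line model: R = k * log10(N)."""
    return k * math.log10(n_files)


errors = {}
for server, files, measured in measurements:
    predicted = launch_time(files)
    errors[server] = 100.0 * (measured - predicted) / measured
    print(f"{server}: measured {measured:.2f} s, "
          f"model {predicted:.2f} s, error {errors[server]:+.2f}%")

worst = max(errors.values(), key=abs)
print(f"largest error: {worst:+.2f}%")
```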

  12. To get a log you need a tree
    [Diagram: a tree whose levels 0, 1, 2 hold 1, 10, 100 nodes; the level is the logarithm of the number of nodes]
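The tree picture can be put in numbers. This sketch assumes a uniform fan-out of 10, as the 1, 10, 100 labels in the diagram suggest. The real directory layout on the servers is not described in the slides, and the function name `levels_to_hold` is hypothetical.

```python
def levels_to_hold(n_items, fanout=10):
    """Smallest level whose node count (fanout ** level) is at least n_items.

    Counts with integers, so there is no floating-point logarithm involved.
    """
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    level, nodes = 0, 1
    while nodes < n_items:
        nodes *= fanout
        level += 1
    return level


# Levels 0, 1, 2 hold 1, 10, 100 nodes, as in the diagram.
for n in (1, 10, 100):
    print(f"{n} nodes -> level {levels_to_hold(n)}")

# Multiplying the item count by 10 adds only one more level to traverse.
for n in (1000, 10000):
    print(f"{n} items -> {levels_to_hold(n)} levels")
```

The number of levels grows by one each time the item count grows tenfold, which is the logarithmic behavior the model captures.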

  13. Modeling Results
    1 Saved $multi-million IBM SP2 cluster
    2 SP2 replacement would NOT have solved anything
    3 Problem caused by “best practices” for system management
    4 Performance management was completely overlooked
    5 Font server held ∼15000 files but only ∼1000 were needed
    6 Simple log performance model told the whole story
    7 Simple fix with no CapEx cost: prune the tree!
    8 300% performance win in shortened launch times!
    9 Log model more about explanation than prediction/forecasting

  14. Contact Information
    Performance Dynamics Company
    Castro Valley, California
    www.perfdynamics.com
    perfdynamics.blogspot.com
    facebook.com/PerformanceDynamics
    twitter.com/DrQz
    [email protected]
    +1-510-537-5758