
Superlinear Speedup: The Perpetual Motion of Parallel Performance

Dr. Neil Gunther
March 05, 2013

First public presentation (2013) of superlinear performance analyzed using the universal scalability law (USL). Includes the Payback theorem at the end.


Transcript

  1. Superlinear Speedup
    The Perpetual Motion of Parallel Performance
    Dr. Neil Gunther
    Performance Dynamics
    Hotsos Symposium
    March 5, 2013
    © 2014 Performance Dynamics Superlinear Speedup October 15, 2014 1 / 64


  2. Outline
    Quick review of 20 years of USL scalability analysis
    Appearance of “superlinear” data starting c. 2010:
    Some users complain USL doesn’t work for superlinearity!
    But precious little correct data (e.g., none on Wikipedia)
    Likely to see more superlinearity in distributed systems
    Can’t just ignore it or people will abandon USL
    Superlinear speedup described on Wikipedia (must be true)
    Add 3rd parameter to USL:
    To fit superlinear data
    Headache the size of an elephant
    April 2012 discovered stunningly simple result
    No modification to USL equation (Huh?)
    Ramifications for scalability analysis are quite profound
    Like perpetual motion, if it’s too good to be true...

  3. Review of USL
    Outline
    1 Review of USL
    2 Application of USL
    Memcache
    Varnish
    Postgres
    3 Superlinearity
    Something for nothing
    Mathematica modeling
    Postgres 9.2FL superlinearity
    4 Summary

  11. Review of USL
    How to Quantify Scalability
    Previous USL presentations at Hotsos:
    Hotsos 2007: “Guerrilla Scalability: How To Do Virtual Load Testing”
    Hotsos 2010: “How to Quantify Oracle Database Scalability: Fundamentals”
    Hotsos 2011: “Brooks, Cooks, and Response Time Scalability”
    Equal bang for the buck: linear concurrency
    Diminishing Returns: contention overhead
    Negative return on investment: coherency overhead
    Calculate scalability curve from performance measurements

  12. Review of USL
    Also ended up in my books
    Chapters 6 and 14 Chapters 4–6
    Also check out:
    Special USL web page
    Guerrilla perf and CaP classes

  19. Review of USL
    Universal Scalability Law (USL)
    N virtual users or processes provide load
    C(N) relative capacity function of N
    But what function?
    $C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
    Three Cs:
    1. Concurrency
    2. Contention (0 < α < 1)
    3. Coherency (0 < β < 1)
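
The three Cs can be seen by switching the two parameters on one at a time. The following is a minimal sketch in R; the function name `usl` and the parameter values 0.05 and 0.01 are illustrative assumptions, not values from the talk.

```r
# USL relative capacity C_N(alpha, beta), written exactly as in the formula above.
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

N <- c(1, 2, 4, 8, 16)

# Illustrative parameter values (assumed, not taken from the slides).
linear     <- usl(N, alpha = 0,    beta = 0)     # concurrency only: C(N) = N
contention <- usl(N, alpha = 0.05, beta = 0)     # diminishing returns
coherency  <- usl(N, alpha = 0.05, beta = 0.01)  # capacity falls at large N

print(data.frame(N, linear, contention, coherency))
```

With both parameters at zero the function returns N itself; a nonzero α bends the curve below linear, and a nonzero β makes it turn over.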

  20. Review of USL
    Concave shape of USL function
    $\frac{X_{\text{data}}(N)}{X_{\text{data}}(1)} \rightarrow C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
    [Plot: C(α, β) versus N for N from 0 to 10]
    Handles scalability degradation (universal)
    Goal is to get rid of scalability maximum
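
The location of the scalability maximum follows from differentiating the USL. The derivation below is a sketch that treats N as a continuous variable (an assumption made for convenience; this step is not shown on the slide).

```latex
\begin{align*}
C(N) &= \frac{N}{D(N)}, \qquad D(N) = 1 + \alpha (N - 1) + \beta N (N - 1) \\
\frac{dC}{dN} &= \frac{D(N) - N\,D'(N)}{D(N)^2}
              = \frac{1 - \alpha - \beta N^2}{D(N)^2} \\
\frac{dC}{dN} = 0 \;&\Longrightarrow\; N_{\max} = \sqrt{\frac{1 - \alpha}{\beta}}
\end{align*}
```

When β = 0 there is no finite maximum, which is consistent with the goal stated above: removing the maximum amounts to driving the coherency term toward zero.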

  21. Review of USL
    How do we determine α and β?
    $C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
    Gene Amdahl (1967): brute force measurement for α
    Clever way: Apply statistical regression
    I will use R:
    FOSS package with 40 yr history (since S at Bell Labs)
    Sophisticated/accurate statistical tools
    Interpreted programming language (cf. Mathematica)
    Magic functions in R:
    nls() nonlinear LSQ fit (α, β in one swell foop)
    optimize() to estimate X(1) if missing
    predict() for smooth interpolation/extrapolation from data
    plot() with many variants

  22. Application of USL
    Outline
    1 Review of USL
    2 Application of USL
    Memcache
    Varnish
    Postgres
    3 Superlinearity
    Something for nothing
    Mathematica modeling
    Postgres 9.2FL superlinearity
    4 Summary

  23. Application of USL Memcache
    Memcache
    Joint work with
    S. Subramanyam (Sun, USA) and S. Parvu (Sun, FI)
    Presented at Velocity 2010 conference

  24. Application of USL Memcache
    Memcache Scalability
    Scaleup Scaleout

  25. Application of USL Memcache
    Memcache: Scaleout strategy
    Distributed cache of key-value pairs
    Pre-loaded from RDBMS
    Tier of cheap, older CPUs (e.g., not multicore)
    Single threading ok, until next hardware roll

  26. Application of USL Memcache
    Memcache: measurements
    Example (Read in raw data and plot it)
    input <- read.table(fname,header=TRUE,sep="\t")
    print(input)
    plot(input$N,input$X_N,type="b")
    [Plot: Raw data for memcached 132; x-axis input$N (2 to 14), y-axis input$X_N (100 to 350)]
    Typing input into R console:
    > input
    N X_N
    1 1 89
    2 2 160
    3 4 272
    4 8 333
    5 10 352
    6 12 339
    7 14 315

  27. Application of USL Memcache
    Memcache: nonlinear regression
    Example (Normalize, check efficiencies, fit USL)
    > input
    N X_N Norm Effcy
    1 1 89 1.000000 1.0000000
    2 2 160 1.797753 0.8988764
    3 4 272 3.056180 0.7640449
    4 8 333 3.741573 0.4676966
    5 10 352 3.955056 0.3955056
    6 12 339 3.808989 0.3174157
    7 14 315 3.539326 0.2528090
    Formula: Norm ~ N/(1 + alpha * (N - 1) + beta * N * (N - 1))
    Parameters:
          Estimate Std. Error t value Pr(>|t|)
    alpha 0.063520   0.011433   5.556 0.002597 **
    beta  0.011323   0.001063  10.649 0.000126 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 0.07824 on 5 degrees of freedom
    Algorithm "port", convergence message: relative convergence (4)
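
The normalization, efficiency check, and fit summarized above can be reproduced roughly as follows. This is a sketch, not the author's script: the data are typed in from the slide rather than read from a file, and the starting values passed to `nls()` are guesses.

```r
# Memcached measurements copied from the slide (N threads, X_N throughput).
input <- data.frame(
  N   = c(1, 2, 4, 8, 10, 12, 14),
  X_N = c(89, 160, 272, 333, 352, 339, 315)
)

# Normalize by the single-thread throughput X(1), then compute efficiency C(N)/N.
input$Norm  <- input$X_N / input$X_N[1]
input$Effcy <- input$Norm / input$N

# Nonlinear least-squares fit of the USL.
# The start values are assumptions; the slide does not show which were used.
fit <- nls(
  Norm ~ N / (1 + alpha * (N - 1) + beta * N * (N - 1)),
  data      = input,
  start     = c(alpha = 0.1, beta = 0.01),
  algorithm = "port"
)

print(input)
print(summary(fit))
```

The fitted coefficients should land close to the α and β estimates in the regression table above.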

  28. Application of USL Memcache
    Memcache: scalability analysis
    [Plot: USL Scalability Analysis of 'memcached 132' Data; x-axis Threads (N), y-axis Throughput X(N) in KOps/s]
    α = 0.0635
    β = 0.011323
    R² = 0.9961
    Nmax = 9.09
    Xmax = 344.76
    Xroof = 1401.13
    Z(sec) = 0
    TS = 3001131110
    Created by NJG on Wed Jan 30 11:10:32 2013
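
The summary numbers in the plot legend can be recomputed from the fitted parameters. The sketch below assumes that Nmax is the peak of the USL curve, that Xmax is the throughput at that peak, and that Xroof is the β = 0 asymptote X(1)/α; these readings are inferred from the values on the slide rather than stated there.

```r
alpha <- 0.063520   # fitted contention, from the regression output
beta  <- 0.011323   # fitted coherency, from the regression output
X1    <- 89         # measured single-thread throughput X(1), in KOps/s

usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

Nmax  <- sqrt((1 - alpha) / beta)     # load at which the USL curve peaks
Xmax  <- X1 * usl(Nmax, alpha, beta)  # throughput at the peak
Xroof <- X1 / alpha                   # asymptotic ceiling if beta were zero

cat(sprintf("Nmax = %.2f  Xmax = %.2f  Xroof = %.2f\n", Nmax, Xmax, Xroof))
```

Under those assumptions the results agree with the legend to within rounding of the fitted coefficients.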

  29. Application of USL Varnish
    Varnish
    Data by D. Popa (DigitAir, RO) via S. Parvu (Nokia, FI)

  30. Application of USL Varnish
    Varnish: architecture
    HTTP accelerator
    Reverse web proxy caching system
    Sits in front of classic web server
    Caching handled by virtual memory
    Highly scalable (linear)

  31. Application of USL Varnish
    Varnish: measurements
    Example (Read in raw data and plot it)
    input <- read.table(fname,header=TRUE,sep="\t")
    print(input)
    plot(input$N,input$X_N,type="b")
    [Plot: Raw data: Varnish; x-axis input$N (0 to 400), y-axis input$X_N (0 to 500)]
    By typing input into R console:
    > input
    N X_N
    1 1 1.4
    2 2 2.7
    3 5 6.4
    4 10 12.8
    5 25 32.0
    6 50 64.0
    7 75 98.0
    8 100 131.0
    9 150 197.0
    10 250 320.0
    11 300 392.0
    12 400 518.0

  32. Application of USL Varnish
    Varnish: nonlinear regression
    Example (Fit to USL model)
    N X_N Norm Effcy
    1 1 1.4 1.000000 1.0000000
    2 2 2.7 1.928571 0.9642857
    3 5 6.4 4.571429 0.9142857
    4 10 12.8 9.142857 0.9142857
    5 25 32.0 22.857143 0.9142857
    6 50 64.0 45.714286 0.9142857
    7 75 98.0 70.000000 0.9333333
    8 100 131.0 93.571429 0.9357143
    9 150 197.0 140.714286 0.9380952
    10 250 320.0 228.571429 0.9142857
    11 300 392.0 280.000000 0.9333333
    12 400 518.0 370.000000 0.9250000
    Formula: Norm ~ N/(1 + alpha * (N - 1) + beta * N * (N - 1))
    Parameters:
            Estimate Std. Error t value Pr(>|t|)
    alpha  5.721e-04  7.220e-05   7.924 1.28e-05 ***
    beta  -9.414e-07  1.978e-07  -4.759 0.000769 ***  <<<<<<< beta < 0
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 2.078 on 10 degrees of freedom
    Number of iterations to convergence: 11
    Achieved convergence tolerance: 1.199e-07

  33. Application of USL Varnish
    Varnish: USL analysis with β < 0
    [Plot: USL Fit to Varnish; x-axis Load (N), y-axis Throughput X(N)]
    α = 6e-04
    β = −1e-06
    R² = 0.9997
    Nmax = NaN
    Xmax = NaN
    Xroof = 2447.13
    Z(sec) = 0
    TS = 3001131837

  34. Application of USL Varnish
    Varnish: USL convex projection
    [Plot: USL bogus projection for Varnish; x-axis Load (N) from 0 to 1000, y-axis Throughput X(N) from 0 to 2000]
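
One way to see why a projection with β < 0 is bogus is to look at the USL denominator: with a negative β it eventually reaches zero, so the projected capacity grows without bound. The sketch below finds that load using the Varnish estimates from the regression output; the quadratic rearrangement is an illustration added here and does not appear in the talk.

```r
alpha <- 5.721e-04    # from the Varnish regression output
beta  <- -9.414e-07   # negative coherency estimate

usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

# The denominator 1 + alpha*(N-1) + beta*N*(N-1) is a quadratic in N:
#   beta*N^2 + (alpha - beta)*N + (1 - alpha)
# With beta < 0 it has one positive root, where the projection blows up.
disc <- (alpha - beta)^2 - 4 * beta * (1 - alpha)
pole <- (-(alpha - beta) - sqrt(disc)) / (2 * beta)

cat(sprintf("Denominator reaches zero near N = %.0f\n", pole))
cat(sprintf("Projected relative capacity at N = 1000: %.0f\n", usl(1000, alpha, beta)))
```

The projected relative capacity at N = 1000 already exceeds 1000, which is the convex upward behaviour shown in the plot.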

  35. Application of USL Varnish
    Varnish: nonlinear regression
    Example (Fit to β = 0 model)
    > input
    N X_N Norm Effcy
    1 1 1.4 1.000000 1.0000000
    2 2 2.7 1.928571 0.9642857
    3 5 6.4 4.571429 0.9142857
    4 10 12.8 9.142857 0.9142857
    5 25 32.0 22.857143 0.9142857
    6 50 64.0 45.714286 0.9142857
    7 75 98.0 70.000000 0.9333333
    8 100 131.0 93.571429 0.9357143
    9 150 197.0 140.714286 0.9380952
    10 250 320.0 228.571429 0.9142857
    11 300 392.0 280.000000 0.9333333
    12 400 518.0 370.000000 0.9250000
    Formula: Norm ~ N/(1 + alpha * (N - 1))   <<<<<<< beta=0 model ****
    Parameters:
           Estimate Std. Error t value Pr(>|t|)
    alpha 0.0002361  0.0000218   10.84  3.3e-07 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    Residual standard error: 3.617 on 11 degrees of freedom
    Number of iterations to convergence: 5
    Achieved convergence tolerance: 9.72e-08
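
The β = 0 fit can be sketched in R by dropping the coherency term from the model formula. The data are typed in from the table above; the start value is a guess, and the Xroof line assumes the ceiling is X(1)/α, which is inferred from the reported values rather than stated in the talk.

```r
# Varnish measurements copied from the slide.
input <- data.frame(
  N   = c(1, 2, 5, 10, 25, 50, 75, 100, 150, 250, 300, 400),
  X_N = c(1.4, 2.7, 6.4, 12.8, 32.0, 64.0, 98.0, 131.0, 197.0, 320.0, 392.0, 518.0)
)
input$Norm <- input$X_N / input$X_N[1]

# One-parameter model: the USL with the coherency term removed (beta = 0).
# The start value is an assumption.
fit0 <- nls(
  Norm ~ N / (1 + alpha * (N - 1)),
  data  = input,
  start = c(alpha = 1e-4)
)

alpha0 <- coef(fit0)[["alpha"]]
Xroof  <- input$X_N[1] / alpha0   # asymptotic throughput ceiling, X(1)/alpha

print(summary(fit0))
cat(sprintf("Xroof = %.2f\n", Xroof))
```

Because the model now has a single contention parameter, the projected curve stays concave and levels off at Xroof instead of curving upward.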

  36. Application of USL Varnish
    Varnish: USL β = 0 analysis
    [Plot: USL Fit to Varnish; x-axis Load (N), y-axis Throughput X(N)]
    α = 2e-04
    β = 0
    R² = 0.9992
    Nmax = NaN
    Xmax = NaN
    Xroof = 5928.53
    Z(sec) = NaN
    TS = 3001131618

  37. Application of USL Varnish
    Varnish: concave scalability projections
    [Plot: USL Projections for Varnish; x-axis Load (N) from 0 to 5000, y-axis Throughput X(N) from 0 to 6000]

  38. Application of USL Postgres
    Postgres
    Data via R. Haas (EnterpriseDB, MA)

  39. Application of USL Postgres
    Postgres: PG 9.x measurements

  40. Application of USL Postgres
    Postgres: PG 9.x scalability analysis
    [Plot: USL Analysis of PG91X; x-axis User threads (N), y-axis NOTx/Sec X(N)]
    α = 0.0385534
    β = 0.00107257
    R² = 0.8687
    Nmax = 29.94
    Xmax = 42999.47
    Xroof = 113434.9
    Z(sec) = NaN
    TS = 604121120

  41. Application of USL Postgres
    Postgres: β = 0 analysis for PG 9.1
    [Plot: USL Analysis of PG91X; x-axis User threads (N), y-axis NOTx/Sec X(N)]
    α = 0.0385534
    β = 0
    R² = 0.8687
    Nmax = NaN
    Xmax = NaN
    Xroof = 113434.9
    Z(sec) = NaN
    TS = 604121128

  42. Application of USL Postgres
    Postgres: USL β = 0 projections for PG 9.1
    [Plot: USL Projections for PG91X; x-axis Clients (N) from 0 to 400, y-axis TPS X(N) from 0 to 200000; legend: PG 9.1 data, USL scalability profile, USL max prediction, PG 9.2FL avg saturation X]

  43. Superlinearity
    Outline
    1 Review of USL
    2 Application of USL
    Memcache
    Varnish
    Postgres
    3 Superlinearity
    Something for nothing
    Mathematica modeling
    Postgres 9.2FL superlinearity
    4 Summary

  44. Superlinearity Something for nothing
    Super Efficiencies

  45. Superlinearity Something for nothing

  46. Superlinearity Something for nothing
    Recent examples
    Perpetual motion
    Perpetual motion contraptions violate conservation of energy law.
    Super efficiency is tantamount to getting more than 100% of something.
    You know it’s wrong but proving it is usually the harder part.
    a. Z-Torque bicycle crank
    b. Negative Kelvin temperatures
    c. Superluminal neutrinos
    Performance super efficiency
    Superlinear scalability (hardware or software) exhibits measured throughput
    performance that exceeds 100% of available capacity.
    Needs explaining (or debugging).

  48. Superlinearity Something for nothing
    a. Z-Torque bicycle crank
    Conjecture (Jan 12, 2013)
    Inventor tries to raise $1000s in start-up capital through crowd funding a
    super-efficient bicycle crank. [Source: Slashdot]
    Bug: Bad physics
    Somebody doesn’t understand vector moments.

  50. Superlinearity Something for nothing
    b. Negative Kelvin temperatures
    Conjecture (Jan 3, 2013)
    Ultracold potassium gas reaches T < 0 K. Impossible! Published in Nature.
    Normal ground state
    Flipped ground state
    Bug: Maybe not
    Depends how you define temperature. Shortly, we’ll see negative time.

  52. Superlinearity Something for nothing
    c. Superluminal neutrinos
    Conjecture (Sept 23, 2011)
    Italian OPERA experiment measured CERN neutrinos with $v_\nu > c$ at 6σ confidence.
    Einstein wrong! Published arXiv.org > hep-ex > arXiv:1109.4897
    Bug: Dec 14, 2011
    Screwed by a $0.50 fiber connector not being screwed tight.

  53. Superlinearity Something for nothing
    Application superlinearity—This is what it looks like
    [Plot: Raw data for PG92flX; x-axis Clients (N) from 0 to 80, y-axis TPS X(N) from 0 to 200000]

  54. Superlinearity Something for nothing
    Superlinear efficiencies
    Example (PG 92FL data)
    > input
    N X_N Norm Effcy
    1 1 4439.85 1.000000 1.0000000 << ok
    2 4 17111.29 3.854023 0.9635058
    3 8 33305.86 7.501573 0.9376966
    4 12 47466.03 10.690907 0.8909089
    5 16 61403.72 13.830132 0.8643832
    6 20 73229.07 16.493589 0.8246794
    7 24 97529.10 21.966754 0.9152814 <-- increasing !?
    8 28 143119.87 32.235290 1.1512604 <-- above 100% !?
    9 32 183640.43 41.361849 1.2925578 <-- ?
    10 36 186552.78 42.017808 1.1671613 <-- ?
    11 40 187370.09 42.201892 1.0550473 <-- ?
    12 44 188295.57 42.410340 0.9638714
    13 48 184799.33 41.622873 0.8671432
    14 52 182925.81 41.200895 0.7923249
    15 56 181790.11 40.945098 0.7311625
    16 60 176109.85 39.665717 0.6610953
    17 64 176334.82 39.716388 0.6205686
    18 68 171278.40 38.577516 0.5673164
    19 72 168922.21 38.046825 0.5284281
    20 76 165651.64 37.310185 0.4909235
    21 80 164238.55 36.991910 0.4623989
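
The efficiency column above is just the normalized throughput divided by N, so the suspicious rows can be flagged mechanically. A small sketch in R with the data typed in from the table (the variable names are illustrative):

```r
# PG 9.2FL measurements copied from the table above.
N   <- c(1, seq(4, 80, by = 4))
X_N <- c(4439.85, 17111.29, 33305.86, 47466.03, 61403.72, 73229.07,
         97529.10, 143119.87, 183640.43, 186552.78, 187370.09, 188295.57,
         184799.33, 182925.81, 181790.11, 176109.85, 176334.82, 171278.40,
         168922.21, 165651.64, 164238.55)

Norm  <- X_N / X_N[1]   # relative capacity C(N)
Effcy <- Norm / N       # efficiency C(N)/N; values above 1 are superlinear

# Loads at which the measured efficiency exceeds 100%.
superlinear <- N[Effcy > 1]
print(superlinear)
```

This picks out N = 28, 32, 36, and 40, the rows flagged in the table as above 100%.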

  59. Superlinearity Something for nothing
    Another way to screw everything up
    [Chart: Median Throughput Comparison; x-axis Threads (1, 4, 16, 64, 256, 1024), y-axis Throughput, NOT/10sec (0 to 10000); series: Clustrix 3 Nodes, Clustrix 6 Nodes, Clustrix 9 Nodes, Intel SSD, HP/FusionIO]
    See the problem? Don’t use log-linear axes. (And certainly not base-2 logs.)
    Without warning the reader ... BIG TIME!

  60. Superlinearity Mathematica modeling
    Mathematica Modeling

  61. Superlinearity Mathematica modeling
    Generic form of superlinear scaling
    [Plot: generic superlinear scaling curve for N from 0 to 20, annotated with "Gradient inflection" and "Gradient maximum"]
    General form appears to be:
    Ideal linear slope: C(N)/N = 100%
    Data above linear slope: C(N)/N > 100%
    Point of inflection (otherwise convex upward: C(N) → ∞)
    Maximum in gradient, with degradation beyond max
    Is it always like this?

  62. Superlinearity Mathematica modeling
    Plausible 3-parameter USL model
    $C_N(\alpha, \beta, \gamma) = \frac{N}{\exp(-\gamma (N - 1)) + \alpha (N - 1) + \beta N (N - 1)}$  (1)
    [Plot: USL 3-Parameter Model, C(N) versus N for N from 0 to 25; Neil J. Gunther, Tue 11 Oct 2011]
    Properties of eqn. (1):
    $e^{-\gamma (N - 1)} \to 1$ as γ → 0
    γ = 0 same as USL
    NLS fit parameters:
    α = 0.001
    β = 0.00425
    γ = 0.1
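
The two properties of eqn. (1) are easy to check numerically. The sketch below uses the fit parameters quoted above; the function names `usl` and `usl3` are illustrative.

```r
# Standard two-parameter USL.
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

# Three-parameter variant from eqn. (1): the leading 1 in the denominator
# is replaced by exp(-gamma * (N - 1)).
usl3 <- function(N, alpha, beta, gamma) {
  N / (exp(-gamma * (N - 1)) + alpha * (N - 1) + beta * N * (N - 1))
}

N     <- 1:25
alpha <- 0.001
beta  <- 0.00425
gamma <- 0.1

C3 <- usl3(N, alpha, beta, gamma)
print(data.frame(N, C3, Effcy = C3 / N))

# With gamma = 0 the exponential equals 1 and the model reduces to the USL.
print(isTRUE(all.equal(usl3(N, alpha, beta, 0), usl(N, alpha, beta))))
```

With γ > 0 the exponential shrinks the denominator at small N, which is what lifts the curve above the linear bound before the β term pulls it back down.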

  63. Superlinearity Mathematica modeling
    Parameterized Elephant
    “With four parameters I can fit an elephant. With five I can make his
    trunk wiggle.” —John von Neumann
    [Figure panels: params = 1, params = 2, params = 3, params = 4]
    See my animated blog post: A Winking Pink Elephant

  65. Superlinearity Mathematica modeling
    Magic Moment !!!
    $C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$  (2)
    [Plot: USL 2-Parameter Model, C(N) versus N for N from 0 to 25; Neil J. Gunther, Thu 19 Apr 2012]
    NLS fit parameters:
    α = −0.0859
    β = 0.0064
    Properties of eqn. (2):
    It’s our fave USL (Hello!)
    But α < 0 allowed
    Capacity credit
    Still have β > 0
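
To see what a negative α does in eqn. (2), the unmodified USL can be evaluated with the fit parameters quoted above. A minimal sketch in R (variable names are illustrative):

```r
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

alpha <- -0.0859   # negative contention: a capacity credit
beta  <- 0.0064    # coherency is still positive

N     <- 1:25
C     <- usl(N, alpha, beta)
Effcy <- C / N

# Loads where capacity exceeds the ideal linear bound, and the peak location.
superlinear_N <- N[Effcy > 1]
peak_N        <- N[which.max(C)]

print(range(superlinear_N))
print(peak_N)
```

The same two-parameter equation produces efficiencies above 100% over a finite range of loads, after which the positive β term takes over and capacity degrades.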

  73. Superlinearity Mathematica modeling
    The Meaning of Negative α
    A Little Story:
    I was supposed to give a talk but the meeting got cancelled
    That means my talk took zero elapsed time ($\Delta t_{\text{talk}} = 0$)
    But that’s assuming I was already in the room
    It was cancelled before I made the trip to the meeting
    My talk took less than zero time or negative time ($\Delta t_{\text{talk}} < 0$)
    Think of the non-trip time as a time credit
    Proposition (Faster than parallel)
    Negative α induces a negative execution time (i.e., a time credit) due to latent
    additional resources (e.g., more memory or cache) and that translates into
    performance that is faster than parallel.

  74. Superlinearity Mathematica modeling
    The Meaning of Negative α in USL
    Initial unit of computing capacity
    [Figure: C(p) versus p]

  75. Superlinearity Mathematica modeling
    Positive α
    Some fraction of original capacity lost to overhead
    [Figure: C(p) versus p, with α marked]

  76. Superlinearity Mathematica modeling
    Negative α
    Some fraction of original capacity is added (opposite sign)
    [Figure: C(p) versus p, with α marked]

  77. Superlinearity Mathematica modeling
    Positive α Capacity Scaling
    Growing capacity loss as system is scaled out
    [Figure: C(p) versus p for Node 1 through Node 6]

  78. Superlinearity Mathematica modeling
    Negative α Capacity Scaling
    Growing capacity increase as system is scaled out
    [Figure: C(p) versus p for Node 1 through Node 6]

  79. Superlinearity Mathematica modeling
    Negative α in the Data
    This is how it would appear in scalability measurements
    [Figure: three panels of C(p) versus p, both axes 0 to 6: Linear, Sublinear, Superlinear]
    Can generalize this concept to nonlinear scalability
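
The three cases in the panels can be tabulated numerically. The following is a minimal sketch, assuming the contention-only form of the USL, C(p) = p / (1 + α(p − 1)) with β = 0. The function name `capacity` and the values α = ±0.05 are illustrative assumptions, not taken from the slides.

```python
def capacity(p, alpha):
    """Relative capacity C(p) with contention only (beta = 0 is assumed)."""
    return p / (1.0 + alpha * (p - 1))


# Illustrative alpha values (assumptions, not values from the slides).
cases = {"Linear": 0.0, "Sublinear": 0.05, "Superlinear": -0.05}

for label, alpha in cases.items():
    # Tabulate C(p) for p = 1..6, matching the range of the panel axes.
    row = [round(capacity(p, alpha), 2) for p in range(1, 7)]
    print(f"{label:12s} alpha={alpha:+.2f}  C(p) for p=1..6: {row}")
```

With a positive α each added node delivers less than one unit of capacity, and with a negative α each added node appears to deliver more than one unit.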

  80. Superlinearity Postgres 9.2FL superlinearity
    Postgres 9.2FL Analysis

  81. Superlinearity Postgres 9.2FL superlinearity
    Postgres: PG 9.2FL measurements
    [Figure: Raw data for PG92flX; TPS X(N) versus Clients (N), axes 0 to 80 and 0 to 200000]

  82. Superlinearity Postgres 9.2FL superlinearity
    Postgres: PG 9.2FL scalability N ≤ 48
    [Figure: USL Analysis of PG92flX; TPS X(N) versus Clients (N), axes 0 to 80 and 0 to 200000]
    α = −0.0109191
    β = 0.000257488
    R² = 0.9521
    Nmax = 62.66
    Xmax = 210508.7
    Xroof = NaN
    Z(sec) = NaN
    TS = 1604121213
    NJG Mon Apr 16 12:13:47 2012

  83. Superlinearity Postgres 9.2FL superlinearity
    Postgres: PG 9.2FL scalability N ≤ 80
    [Figure: USL Analysis of PG92flX; TPS X(N) versus Clients (N), axes 0 to 80 and 0 to 200000]
    α = −0.0155072
    β = 0.000386942
    R² = 0.9579
    Nmax = 51.23
    Xmax = 186930.1
    Xroof = NaN
    Z(sec) = NaN
    TS = 1604121214
    NJG Mon Apr 16 12:14:49 2012
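
The reported Nmax values can be checked against the fitted coefficients. This sketch assumes the standard USL peak location Nmax = sqrt((1 − α)/β), which remains well defined for negative α because 1 − α stays positive. The helper name `n_max` is hypothetical; the coefficients are the ones shown on the two fit slides.

```python
import math


def n_max(alpha, beta):
    """Load at which USL capacity peaks: sqrt((1 - alpha) / beta)."""
    return math.sqrt((1.0 - alpha) / beta)


# Fitted coefficients as reported on the two PG 9.2FL fit slides.
fits = {
    "N <= 48": (-0.0109191, 0.000257488),
    "N <= 80": (-0.0155072, 0.000386942),
}

for label, (alpha, beta) in fits.items():
    print(f"{label}: Nmax = {n_max(alpha, beta):.2f}")
```

Including the measurements beyond N = 48 makes the fitted α more negative and β larger, which moves the predicted peak to a lower client count.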

  84. Superlinearity Postgres 9.2FL superlinearity
    Superlinear scaling zones
    $C_N(\alpha, \beta) = \frac{N}{1 - \alpha(N - 1) + \beta N(N - 1)}$
    [Figure: C(N) versus N, both axes 0 to 20, with zones labeled Superlinear and Payback]
    (a) Data in superlinear zone
    where C(N)/N > 100%
    like perpetual motion
    (b) Data in payback zone
    paying the piper
    sudden degradation
    where C(N)/N < 100%
    (c) Is it always like this?
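
The two zones can be told apart by computing the efficiency C(N)/N at each load point. The sketch below assumes the signed USL form C(N) = N / (1 + α(N − 1) + βN(N − 1)) with α < 0, and assumes the denominator stays positive over the loads of interest. The function names are hypothetical; the coefficients are the PG 9.2FL fit for N ≤ 80.

```python
def efficiency(n, alpha, beta):
    """Efficiency C(N)/N for the signed USL form (alpha < 0 when superlinear)."""
    return 1.0 / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def zone(n, alpha, beta):
    """Label a load point by comparing its efficiency with 100%."""
    eff = efficiency(n, alpha, beta)
    if eff > 1.0:
        return "superlinear"
    if eff < 1.0:
        return "payback"
    return "linear"


ALPHA, BETA = -0.0155072, 0.000386942  # PG 9.2FL fit for N <= 80

for n in (20, 40, 60):
    print(n, round(efficiency(n, ALPHA, BETA), 3), zone(n, ALPHA, BETA))
```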

  85. Superlinearity Postgres 9.2FL superlinearity
    Superlinear Payback Theorem
    Theorem (Gunther 2012)
    Superlinear scaling in the USL model, with α < 0 and β > 0, always induces capacity
    degradation because the following properties hold:
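    1 Superlinear asymptote at: $N_\alpha = \frac{\alpha - 1}{\alpha}$
    2 Inflection point N± is the smallest positive root of:
    $\partial_N^2 C_{sl}(N, -\alpha, \beta) = N^3 \beta^2 + (3N - 1)(\alpha - 1)\beta + (\alpha - 1)\alpha = 0$
    3 Capacity maximum at: $N_{max} = \sqrt{\frac{1 - \alpha}{\beta}}$
    4 ∀N > N±, superlinear capacity Csl (N) crosses the linear bound at: $N_x = \left| \frac{\alpha}{\beta} \right|$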
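
Items 1, 3 and 4 can be recovered in a few lines. This is a sketch of the algebra, assuming the signed USL form in which α itself is negative for superlinear data; it is not the derivation given in the talk.

```latex
% Assumed signed USL form (alpha < 0 for superlinear data):
C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}

% Item 1: with beta = 0 the denominator vanishes at the asymptote
1 + \alpha (N_\alpha - 1) = 0
\;\Rightarrow\; N_\alpha = \frac{\alpha - 1}{\alpha}

% Item 3: the maximum is where the first derivative vanishes
\frac{dC}{dN} = \frac{(1 - \alpha) - \beta N^2}
                     {\left[ 1 + \alpha (N - 1) + \beta N (N - 1) \right]^2} = 0
\;\Rightarrow\; N_{max} = \sqrt{\frac{1 - \alpha}{\beta}}

% Item 4: C(N) = N for N > 1 requires the two overhead terms to cancel
\alpha (N_x - 1) + \beta N_x (N_x - 1) = 0
\;\Rightarrow\; N_x = -\frac{\alpha}{\beta} = \left| \frac{\alpha}{\beta} \right|
```

The crossing in item 4 exists only because β > 0: with β = 0 the two terms cannot cancel for any N > 1.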

  86. Superlinearity Postgres 9.2FL superlinearity
    Visual proof: Superlinear asymptote
    [Figure: C(N) versus N]
    Proof.
    Linear bound: C(N)/N = 1 (dashed line)
    Super efficient region: Csl (N)/N > 1
    Superlinear segment curved upward by α < 0 (convex function)
    Asymptote at N = Nα (vertical line) where Csl (N, −α) → ∞

  87. Superlinearity Postgres 9.2FL superlinearity
    Visual proof: Upper bound and Saturation
    [Figure: C(N) versus N]
    Proof.
    A physical capacity bound must exist (dashed horizontal line)
    Csl (N) scaling curve will saturate below that bound (2nd red segment)
    That saturation segment must cross linear bound at Nx
    Therefore, must be an inflection point in Csl (N) at N± < Nx

  88. Superlinearity Postgres 9.2FL superlinearity
    Visual proof: Inflection, Crossing and Degradation
    [Figure: C(N) versus N]
    Proof.
    Inflection point N± joins superlinear and saturation segments
    Csl (N) crosses linear bound at Nx = |α/β|
    Since α < 0, crossing can only arise from coherency term with β > 0
    Hence, superlinearity always induces coherency roll off (payback)

  89. Superlinearity Postgres 9.2FL superlinearity
    Payback parameters for PG 9.2FL
    [Figure: USL Analysis of PG92flX; TPS X(N) versus Clients (N), axes 0 to 80 and 0 to 200000]
    α = −0.0155072
    β = 0.000386942
    R² = 0.9579
    Nmax = 51.23
    Xmax = 186930.1
    Xroof = NaN
    Z(sec) = NaN
    TS = 1604121214
    NJG Mon Apr 16 12:14:49 2012
    Payback parameters for α = −0.0155, β = 0.000387:
    N± = 14.0351
    Nx = 40.0517
    Nmax = 51.2253
    Nα = 65.5161
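
The four payback values can be recomputed from the rounded coefficients α = −0.0155 and β = 0.000387. The sketch below assumes the closed forms of the Superlinear Payback Theorem, with the capacity maximum taken as sqrt((1 − α)/β), and finds the inflection point by bisecting the theorem's cubic on [0, Nmax]. The function name and the bisection approach are illustrative choices, not the author's method.

```python
import math


def payback_parameters(alpha, beta):
    """Return (N_inflection, N_x, N_max, N_alpha) for alpha < 0 < beta."""
    n_alpha = (alpha - 1.0) / alpha          # superlinear asymptote
    n_max = math.sqrt((1.0 - alpha) / beta)  # capacity maximum
    n_x = abs(alpha / beta)                  # crossing of the linear bound

    def cubic(n):
        # Inflection condition from item 2 of the theorem.
        return (n**3 * beta**2
                + (3 * n - 1) * (alpha - 1) * beta
                + (alpha - 1) * alpha)

    # The cubic decreases on [0, n_max]; bisection needs a sign change there.
    lo, hi = 0.0, n_max
    if not (cubic(lo) > 0 > cubic(hi)):
        raise ValueError("no sign change on [0, n_max] for these coefficients")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cubic(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), n_x, n_max, n_alpha


n_infl, n_x, n_max, n_alpha = payback_parameters(-0.0155, 0.000387)
print(f"N+- = {n_infl:.4f}, Nx = {n_x:.4f}, "
      f"Nmax = {n_max:.4f}, Nalpha = {n_alpha:.4f}")
```

For these coefficients the ordering is N± < Nx < Nmax < Nα, so the curve bends over, crosses the linear bound, and peaks before it could ever reach the asymptote.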

  90. Summary
    Outline
    1 Review of USL
    2 Application of USL
    Memcache
    Varnish
    Postgres
    3 Superlinearity
    Something for nothing
    Mathematica modeling
    Postgres 9.2FL superlinearity
    4 Summary

  91. Summary
    Summary
    USL is 2-parameter scalability model C(N, α, β)
    Requires α, β > 0 for C(N) to be concave function
    Superlinear measurements C(N)/N > 1 do exist
    Extra fitting parameter C(N, α, β, γ) ⇒ JvN elephants
    Discovered superlinear USL with α < 0
    Super-efficiencies are not free
    Like perpetual motion:
    no free lunch
    pay the piper eventually
    debugging it is the hard part
    Thm: Superlinearity always followed by capacity degradation
    More (Oracle ???) superlinear measurements would be good

  92. Summary
    Thank you for attending!
    Castro Valley, California
    www.perfdynamics.com
    perfdynamics.blogspot.com
    Twitter/DrQz
    Facebook
    [email protected]
    +1-510-537-5758