Slide 1

Slide 1 text

Superlinear Speedup
The Perpetual Motion of Parallel Performance
Dr. Neil Gunther, Performance Dynamics
Hotsos Symposium, March 5, 2013
© 2014 Performance Dynamics. Superlinear Speedup, October 15, 2014.

Slide 2

Slide 2 text

Outline
- Quick review
  - 20 years of USL scalability analysis
- Appearance of “super linear” data starting c. 2010:
  - Some users complain USL doesn’t work for superlinearity!
  - But precious little correct data (e.g., none on Wikipedia)
  - Likely to see more superlinearity in distributed systems
  - Can’t just ignore it or people will abandon USL
  - Super linear speedup described on Wikipedia (must be true)
- Add 3rd parameter to USL:
  - To fit superlinear data
  - Headache the size of an elephant
- April 2012 discovered stunningly simple result
  - No modification to USL equation (Huh?)
  - Ramifications for scalability analysis are quite profound
- Like perpetual motion, if it’s too good to be true...

Slide 3

Slide 3 text

Outline (current section: Review of USL)
1. Review of USL
2. Application of USL
   - Memcache
   - Varnish
   - Postgres
3. Superlinearity
   - Something for nothing
   - Mathematica modeling
   - Postgres 9.2FL superlinearity
4. Summary

Slide 11

Slide 11 text

How to Quantify Scalability
Previous USL presentations at Hotsos:
- Hotsos 2007: “Guerrilla Scalability: How To Do Virtual Load Testing”
- Hotsos 2010: “How to Quantify Oracle Database Scalability: Fundamentals”
- Hotsos 2011: “Brooks, Cooks, and Response Time Scalability”
Key points:
- Equal bang for the buck: linear concurrency
- Diminishing returns: contention overhead
- Negative return on investment: coherency overhead
- Calculate scalability curve from performance measurements

Slide 12

Slide 12 text

Also ended up in my books
- Chapters 6 and 14
- Chapters 4–6
Also check out:
- Special USL web page
- Guerrilla perf and CaP classes

Slide 19

Slide 19 text

Universal Scalability Law (USL)
- N virtual users or processes provide load
- C(N): relative capacity as a function of N
- But what function?
  $C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
- Three Cs:
  1. Concurrency
  2. Contention (0 < α < 1)
  3. Coherency (0 < β < 1)
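As a minimal sketch, the USL formula on this slide can be written as a small R function. The name `usl` and the sample parameter values below are illustrative assumptions and do not come from the slides.

```r
# USL relative capacity C(N) for load N, contention alpha, coherency beta.
# `usl` is a hypothetical helper name chosen for illustration.
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

# With alpha = beta = 0 the model reduces to ideal linear scaling, C(N) = N.
linear <- usl(1:4, alpha = 0, beta = 0)
print(linear)

# Any contention or coherency cost pulls capacity below linear for N > 1.
# (alpha = 0.05 and beta = 0.001 are made-up values for illustration only.)
degraded <- usl(c(1, 8, 32), alpha = 0.05, beta = 0.001)
print(degraded)
```

Because the function is vectorized over N, the same helper can be reused for plotting a whole curve or for evaluating a single load point.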

Slide 20

Slide 20 text

Concave shape of USL function
$\frac{X_{\text{data}}(N)}{X_{\text{data}}(1)} \rightarrow C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
[Figure: concave USL curve; x-axis: N, y-axis: C(α, β)]
- Handles scalability degradation (universal)
- Goal is to get rid of scalability maximum

Slide 21

Slide 21 text

How do we determine α and β?
$C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$
- Gene Amdahl (1967): brute force measurement for α
- Clever way: apply statistical regression
- I will use R:
  - FOSS package with 40 yr history (since S at Bell Labs)
  - Sophisticated/accurate statistical tools
  - Interpreted programming language (cf. Mathematica)
- Magic functions in R:
  - nls(): nonlinear LSQ fit (α, β in one swell foop)
  - optimize(): to estimate X(1) if missing
  - predict(): for smooth interpolation/extrapolation from data
  - plot(): with many variants

Slide 22

Slide 22 text

Outline (current section: Application of USL)
1. Review of USL
2. Application of USL
   - Memcache
   - Varnish
   - Postgres
3. Superlinearity
   - Something for nothing
   - Mathematica modeling
   - Postgres 9.2FL superlinearity
4. Summary

Slide 23

Slide 23 text

Memcache
- Joint work with S. Subramanyam (Sun, USA) and S. Parvu (Sun, FI)
- Presented at Velocity 2010 conference

Slide 24

Slide 24 text

Memcache Scalability
- Scaleup
- Scaleout

Slide 25

Slide 25 text

Memcache: Scaleout strategy
- Distributed cache of key-value pairs
- Pre-loaded from RDBMS
- Tier of cheap, older CPUs (e.g., not multicore)
- Single threading ok, until next hardware roll

Slide 26

Slide 26 text

Memcache: measurements
Example (Read in raw data and plot it)

    # fname: path to a tab-separated file with a header row (columns N and X_N)
    input <- read.table(fname, header=TRUE, sep="\t")
    print(input)
    plot(input$N, input$X_N, type="b")

[Figure: Raw data for memcached 132; x-axis: input$N, y-axis: input$X_N]

Typing input into R console:

    > input
       N X_N
    1  1  89
    2  2 160
    3  4 272
    4  8 333
    5 10 352
    6 12 339
    7 14 315

Slide 27

Slide 27 text

Memcache: nonlinear regression
Example (Normalize, check efficiencies, fit USL)

    > input
       N X_N     Norm     Effcy
    1  1  89 1.000000 1.0000000
    2  2 160 1.797753 0.8988764
    3  4 272 3.056180 0.7640449
    4  8 333 3.741573 0.4676966
    5 10 352 3.955056 0.3955056
    6 12 339 3.808989 0.3174157
    7 14 315 3.539326 0.2528090

    Formula: Norm ~ N/(1 + alpha * (N - 1) + beta * N * (N - 1))

    Parameters:
          Estimate Std. Error t value Pr(>|t|)
    alpha 0.063520   0.011433   5.556 0.002597 **
    beta  0.011323   0.001063  10.649 0.000126 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.07824 on 5 degrees of freedom

    Algorithm "port", convergence message: relative convergence (4)
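The slide shows only the output of the normalize-and-fit step. A hedged sketch of R code that could produce output of this shape follows. The data values are copied from the slide; the starting values and the [0, 1] parameter bounds are assumptions, since the slide reports only that algorithm "port" was used.

```r
# Throughput measurements copied from the memcached slide.
input <- data.frame(
  N   = c(1, 2, 4, 8, 10, 12, 14),
  X_N = c(89, 160, 272, 333, 352, 339, 315)
)

# Normalize by the single-thread throughput X(1), then compute
# the per-thread efficiency C(N)/N.
input$Norm  <- input$X_N / input$X_N[input$N == 1]
input$Effcy <- input$Norm / input$N
print(input)

# Nonlinear least-squares fit of the USL to the normalized data.
# Assumed settings: start values (0.1, 0.01) and bounds 0 <= alpha, beta <= 1.
fit <- nls(
  Norm ~ N / (1 + alpha * (N - 1) + beta * N * (N - 1)),
  data      = input,
  start     = c(alpha = 0.1, beta = 0.01),
  algorithm = "port",
  lower     = c(0, 0),
  upper     = c(1, 1)
)
print(summary(fit))
```

With these settings the estimates should land near the α and β reported on the slide, although a different choice of starting values could change the iteration path.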

Slide 28

Slide 28 text

Memcache: scalability analysis
[Figure: USL Scalability Analysis of 'memcached 132' Data; x-axis: Threads (N), y-axis: Throughput X(N) in KOps/s]
Fit legend: α = 0.0635, β = 0.011323, R² = 0.9961, Nmax = 9.09, Xmax = 344.76, Xroof = 1401.13, Z(sec) = 0, TS = 3001131110
Created by NJG on Wed Jan 30 11:10:32 2013
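The derived quantities in the plot legend can be recomputed from the fitted parameters. The sketch below assumes that Nmax is the location of the USL maximum, sqrt((1 − α)/β), and that Xroof is the β = 0 ceiling X(1)/α; both assumptions reproduce the legend values closely, but the slides do not spell out the definitions. The helper name `usl` is hypothetical.

```r
# Hypothetical helper: USL relative capacity.
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

alpha <- 0.063520   # fitted contention, from the regression slide
beta  <- 0.011323   # fitted coherency, from the regression slide
X1    <- 89         # measured single-thread throughput in KOps/s

Nmax  <- sqrt((1 - alpha) / beta)      # load at which C(N) peaks
Xmax  <- X1 * usl(Nmax, alpha, beta)   # throughput at that peak
Xroof <- X1 / alpha                    # assumed: asymptote when beta = 0

print(round(c(Nmax = Nmax, Xmax = Xmax, Xroof = Xroof), 2))
```

Small differences in the last digit relative to the legend are expected, because the slide's parameters are printed with limited precision.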

Slide 29

Slide 29 text

Varnish
- Data by D. Popa (DigitAir, RO) via S. Parvu (Nokia, FI)

Slide 30

Slide 30 text

Varnish: architecture
- HTTP accelerator
- Reverse web proxy caching system
- Sits in front of classic web server
- Caching handled by virtual memory
- Highly scalable (linear)

Slide 31

Slide 31 text

Varnish: measurements
Example (Read in raw data and plot it)

    # fname: path to a tab-separated file with a header row (columns N and X_N)
    input <- read.table(fname, header=TRUE, sep="\t")
    print(input)
    plot(input$N, input$X_N, type="b")

[Figure: Raw data: Varnish; x-axis: input$N, y-axis: input$X_N]

By typing input into R console:

    > input
         N   X_N
    1    1   1.4
    2    2   2.7
    3    5   6.4
    4   10  12.8
    5   25  32.0
    6   50  64.0
    7   75  98.0
    8  100 131.0
    9  150 197.0
    10 250 320.0
    11 300 392.0
    12 400 518.0

Slide 32

Slide 32 text

Varnish: nonlinear regression
Example (Fit to USL model)

         N   X_N       Norm     Effcy
    1    1   1.4   1.000000 1.0000000
    2    2   2.7   1.928571 0.9642857
    3    5   6.4   4.571429 0.9142857
    4   10  12.8   9.142857 0.9142857
    5   25  32.0  22.857143 0.9142857
    6   50  64.0  45.714286 0.9142857
    7   75  98.0  70.000000 0.9333333
    8  100 131.0  93.571429 0.9357143
    9  150 197.0 140.714286 0.9380952
    10 250 320.0 228.571429 0.9142857
    11 300 392.0 280.000000 0.9333333
    12 400 518.0 370.000000 0.9250000

    Formula: Norm ~ N/(1 + alpha * (N - 1) + beta * N * (N - 1))

    Parameters:
            Estimate Std. Error t value Pr(>|t|)
    alpha  5.721e-04  7.220e-05   7.924 1.28e-05 ***
    beta  -9.414e-07  1.978e-07  -4.759 0.000769 ***   <<<<<<< beta < 0
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.078 on 10 degrees of freedom

    Number of iterations to convergence: 11
    Achieved convergence tolerance: 1.199e-07

Slide 33

Slide 33 text

Varnish: USL analysis with β < 0
[Figure: USL Fit to Varnish; x-axis: Load (N), y-axis: Throughput X(N)]
Fit legend: α = 6e-04, β = −1e-06, R² = 0.9997, Nmax = NaN, Xmax = NaN, Xroof = 2447.13, Z(sec) = 0, TS = 3001131837

Slide 34

Slide 34 text

Varnish: USL convex projection
[Figure: USL bogus projection for Varnish; x-axis: Load (N), y-axis: Throughput X(N)]
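One way to see why a projection with β < 0 is bogus: the USL denominator then has a positive root, so the projected throughput diverges at a finite load. The sketch below uses the α and β estimates from the Varnish regression output; the variable names are illustrative assumptions.

```r
alpha <- 5.721e-04    # Varnish fit with beta left free
beta  <- -9.414e-07   # negative coherency estimate from that fit

# The USL denominator is a quadratic in N:
#   1 + alpha*(N - 1) + beta*N*(N - 1)
#     = (1 - alpha) + (alpha - beta)*N + beta*N^2
# polyroot() expects coefficients in increasing order of power.
roots <- polyroot(c(1 - alpha, alpha - beta, beta))

# Keep the real, positive root: the load at which the denominator
# reaches zero and the projected capacity blows up.
real_roots <- Re(roots)[abs(Im(roots)) < 1e-6]
N_pole <- min(real_roots[real_roots > 0])
print(N_pole)
```

The pole sits beyond the plotted range, which is why the projection merely looks convex there instead of visibly diverging.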

Slide 35

Slide 35 text

Varnish: nonlinear regression
Example (Fit to β = 0 model)

    > input
         N   X_N       Norm     Effcy
    1    1   1.4   1.000000 1.0000000
    2    2   2.7   1.928571 0.9642857
    3    5   6.4   4.571429 0.9142857
    4   10  12.8   9.142857 0.9142857
    5   25  32.0  22.857143 0.9142857
    6   50  64.0  45.714286 0.9142857
    7   75  98.0  70.000000 0.9333333
    8  100 131.0  93.571429 0.9357143
    9  150 197.0 140.714286 0.9380952
    10 250 320.0 228.571429 0.9142857
    11 300 392.0 280.000000 0.9333333
    12 400 518.0 370.000000 0.9250000

    Formula: Norm ~ N/(1 + alpha * (N - 1))   <<<<<<< beta=0 model ****

    Parameters:
           Estimate Std. Error t value Pr(>|t|)
    alpha 0.0002361  0.0000218   10.84  3.3e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 3.617 on 11 degrees of freedom

    Number of iterations to convergence: 5
    Achieved convergence tolerance: 9.72e-08

Slide 36

Slide 36 text

Varnish: USL β = 0 analysis
[Figure: USL Fit to Varnish; x-axis: Load (N), y-axis: Throughput X(N)]
Fit legend: α = 2e-04, β = 0, R² = 0.9992, Nmax = NaN, Xmax = NaN, Xroof = 5928.53, Z(sec) = NaN, TS = 3001131618

Slide 37

Slide 37 text

Varnish: concave scalability projections
[Figure: USL Projections for Varnish; x-axis: Load (N), y-axis: Throughput X(N)]

Slide 38

Slide 38 text

Postgres
- Data via R. Haas (EnterpriseDB, MA)

Slide 39

Slide 39 text

Postgres: PG 9.x measurements

Slide 40

Slide 40 text

Postgres: PG 9.x scalability analysis
[Figure: USL Analysis of PG91X; x-axis: User threads (N), y-axis: NOTx/Sec X(N)]
Fit legend: α = 0.0385534, β = 0.00107257, R² = 0.8687, Nmax = 29.94, Xmax = 42999.47, Xroof = 113434.9, Z(sec) = NaN, TS = 604121120

Slide 41

Slide 41 text

Postgres: β = 0 analysis for PG 9.1
[Figure: USL Analysis of PG91X; x-axis: User threads (N), y-axis: NOTx/Sec X(N)]
Fit legend: α = 0.0385534, β = 0, R² = 0.8687, Nmax = NaN, Xmax = NaN, Xroof = 113434.9, Z(sec) = NaN, TS = 604121128

Slide 42

Slide 42 text

Postgres: USL β = 0 projections for PG 9.1
[Figure: USL Projections for PG91X; x-axis: Clients (N), y-axis: TPS X(N); legend: PG 9.1 data, USL scalability profile, USL max prediction, PG 9.2FL avg saturation X]

Slide 43

Slide 43 text

Outline (current section: Superlinearity)
1. Review of USL
2. Application of USL
   - Memcache
   - Varnish
   - Postgres
3. Superlinearity
   - Something for nothing
   - Mathematica modeling
   - Postgres 9.2FL superlinearity
4. Summary

Slide 44

Slide 44 text

Super Efficiencies

Slide 45

Slide 45 text

Superlinearity: Something for nothing

Slide 46

Slide 46 text

Recent examples

Perpetual motion
Perpetual motion contraptions violate conservation of energy law. Super efficiency is tantamount to getting more than 100% of something. You know it’s wrong but proving it is usually the harder part.
  a. Z-Torque bicycle crank
  b. Negative Kelvin temperatures
  c. Superluminal neutrinos

Performance super efficiency
Superlinear scalability (hardware or software) exhibits measured throughput performance that exceeds 100% of available capacity. Needs explaining (or debugging).

Slide 48

Slide 48 text

a. Z-Torque bicycle crank
Conjecture (Jan 12, 2013): Inventor tries to raise $1000s in start-up capital by crowdfunding a super-efficient bicycle crank. [Source: Slashdot]
Bug: Bad physics. Somebody doesn’t understand vector moments.

Slide 50

Slide 50 text

b. Negative Kelvin temperatures
Conjecture (Jan 3, 2013): Ultracold potassium gas reaches T < 0 K. Impossible! Published in Nature.
[Figure panels: Normal ground state; Flipped ground state]
Bug: Maybe not. Depends how you define temperature. Shortly, we’ll see negative time.

Slide 52

Slide 52 text

c. Superluminal neutrinos
Conjecture (Sept 23, 2011): Italian OPERA experiment measured CERN neutrinos with v_ν > c at 6σ confidence. Einstein wrong! Published on arXiv.org (hep-ex), arXiv:1109.4897.
Bug (Dec 14, 2011): Screwed by a $0.50 fiber connector not being screwed tight.

Slide 53

Slide 53 text

Application superlinearity: this is what it looks like
[Figure: Raw data for PG92flX; x-axis: Clients (N), y-axis: TPS X(N)]

Slide 54

Slide 54 text

Superlinear efficiencies
Example (PG 9.2FL data)

    > input
        N       X_N      Norm     Effcy
    1   1   4439.85  1.000000 1.0000000  << ok
    2   4  17111.29  3.854023 0.9635058
    3   8  33305.86  7.501573 0.9376966
    4  12  47466.03 10.690907 0.8909089
    5  16  61403.72 13.830132 0.8643832
    6  20  73229.07 16.493589 0.8246794
    7  24  97529.10 21.966754 0.9152814  <-- increasing !?
    8  28 143119.87 32.235290 1.1512604  <-- above 100% !?
    9  32 183640.43 41.361849 1.2925578  <-- ?
    10 36 186552.78 42.017808 1.1671613  <-- ?
    11 40 187370.09 42.201892 1.0550473  <-- ?
    12 44 188295.57 42.410340 0.9638714
    13 48 184799.33 41.622873 0.8671432
    14 52 182925.81 41.200895 0.7923249
    15 56 181790.11 40.945098 0.7311625
    16 60 176109.85 39.665717 0.6610953
    17 64 176334.82 39.716388 0.6205686
    18 68 171278.40 38.577516 0.5673164
    19 72 168922.21 38.046825 0.5284281
    20 76 165651.64 37.310185 0.4909235
    21 80 164238.55 36.991910 0.4623989
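The efficiency column above can be recomputed directly from the raw throughput, which makes the superlinear points easy to flag. This is a small sketch using the N and X_N values from the table; the variable names are illustrative assumptions.

```r
# Client counts and throughput (TPS) copied from the PG 9.2FL table.
N   <- c(1, seq(4, 80, by = 4))
X_N <- c(4439.85, 17111.29, 33305.86, 47466.03, 61403.72, 73229.07,
         97529.10, 143119.87, 183640.43, 186552.78, 187370.09, 188295.57,
         184799.33, 182925.81, 181790.11, 176109.85, 176334.82, 171278.40,
         168922.21, 165651.64, 164238.55)

# Relative capacity C(N) = X(N)/X(1) and efficiency C(N)/N.
Norm  <- X_N / X_N[1]
Effcy <- Norm / N

# Loads where measured efficiency exceeds 100%.
superlinear_N <- N[Effcy > 1]
print(superlinear_N)
```

A check like this is a cheap first screen before any model fitting: any load with efficiency above 1 needs explaining, as the talk puts it.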

Slide 59

Slide 59 text

Another way to screw everything up
[Figure: Median Throughput Comparison; x-axis: Threads (1, 4, 16, 64, 256, 1024), y-axis: Throughput, NOT/10sec; series: Clustrix 3 Nodes, Clustrix 6 Nodes, Clustrix 9 Nodes, Intel SSD, HP/FusionIO]
See the problem?
- Don’t use log-linear axes.
- (And certainly not base-2 logs.)
- Without warning the reader ...
- BIG TIME!

Slide 60

Slide 60 text

Mathematica Modeling

Slide 61

Slide 61 text

Generic form of superlinear scaling
[Figure: generic superlinear scaling curve with "Gradient inflection" and "Gradient maximum" marked; both axes run from 0 to 20]
General form appears to be:
- Ideal linear slope: C(N)/N = 100%
- Data above linear slope: C(N)/N > 100%
- Point of inflection
  - Otherwise convex upward: C(N) → ∞
- Maximum in gradient
  - Degradation beyond max
- Is it always like this?

Slide 62

Slide 62 text

Plausible 3-parameter USL model
$C_N(\alpha, \beta, \gamma) = \frac{N}{\exp(-\gamma (N - 1)) + \alpha (N - 1) + \beta N (N - 1)} \qquad (1)$
[Figure: USL 3-Parameter Model; x-axis: N, y-axis: C(N); Neil J. Gunther, Tue 11 Oct 2011]
Properties of eqn. (1):
- $e^{-\gamma (N - 1)} \to 1$ as $\gamma \to 0$
- γ = 0 same as USL
NLS fit parameters: α = 0.001, β = 0.00425, γ = 0.1
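The two properties listed for eqn. (1) can be checked numerically. The sketch below codes the 3-parameter model next to the standard USL; the function names `usl` and `usl3` are hypothetical, and the parameter values are the NLS fit parameters quoted on the slide.

```r
# Standard 2-parameter USL (hypothetical helper name).
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

# 3-parameter variant from eqn. (1): the leading 1 in the denominator
# is replaced by exp(-gamma * (N - 1)).
usl3 <- function(N, alpha, beta, gamma) {
  N / (exp(-gamma * (N - 1)) + alpha * (N - 1) + beta * N * (N - 1))
}

N <- 1:25

# Setting gamma = 0 should recover the standard USL exactly.
same_as_usl <- isTRUE(all.equal(usl3(N, 0.001, 0.00425, 0),
                                usl(N, 0.001, 0.00425)))
print(same_as_usl)

# With the slide's fit parameters, efficiency C(N)/N exceeds 1 at small N.
eff3 <- usl3(N, alpha = 0.001, beta = 0.00425, gamma = 0.1) / N
print(N[eff3 > 1])
```

The exponential term shrinks the denominator below 1 at low load, which is what lets this form rise above the linear bound before the βN(N − 1) term takes over.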

Slide 63

Slide 63 text

Parameterized Elephant
“With four parameters I can fit an elephant. With five I can make his trunk wiggle.” (John von Neumann)
[Figure panels: elephant fits with params = 1, 2, 3, 4]
See my animated blog post: A Winking Pink Elephant

Slide 65

Slide 65 text

Magic Moment !!!
$C_N(\alpha, \beta) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)} \qquad (2)$
[Figure: USL 2-Parameter Model; x-axis: N, y-axis: C(N); Neil J. Gunther, Thu 19 Apr 2012]
NLS fit parameters: α = −0.0859, β = 0.0064
Properties of eqn. (2):
- It’s our fave USL (Hello!)
- But α < 0 allowed
  - Capacity credit
- Still have β > 0
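To illustrate how the unmodified USL produces superlinear values once α is negative, the sketch below evaluates eqn. (2) with the fit parameters quoted on the slide and lists the loads where efficiency exceeds 100%. The helper name `usl` is an assumption made for illustration.

```r
# Standard USL, unchanged (hypothetical helper name).
usl <- function(N, alpha, beta) {
  N / (1 + alpha * (N - 1) + beta * N * (N - 1))
}

alpha <- -0.0859   # negative contention: a capacity credit
beta  <-  0.0064   # coherency is still positive

N   <- 1:25
eff <- usl(N, alpha, beta) / N   # efficiency C(N)/N

# Loads at which the model is faster than linear.
superlinear_N <- N[eff > 1]
print(superlinear_N)

# Beyond |alpha / beta| the coherency term wins and efficiency drops below 1.
print(abs(alpha / beta))
```

The superlinear range ends just below |α/β|, the point where the negative α(N − 1) term and the positive βN(N − 1) term cancel.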

Slide 73

Slide 73 text

The Meaning of Negative α
A Little Story:
- I was supposed to give a talk
- but the meeting got cancelled
- That means my talk took zero elapsed time (Δt_talk = 0)
- But that’s assuming I was already in the room
- It was cancelled before I made the trip to the meeting
- My talk took less than zero time or negative time (Δt_talk < 0)
- Think of the non-trip time as a time credit

Proposition (Faster than parallel): Negative α induces a negative execution time (i.e., a time credit) due to latent additional resources (e.g., more memory or cache) and that translates into performance that is faster than parallel.

Slide 74

Slide 74 text

The Meaning of Negative α in USL
Initial unit of computing capacity
[Figure: capacity C(p) versus p]

Slide 75

Slide 75 text

Positive α
Some fraction of original capacity lost to overhead
[Figure: C(p) versus p, with the fraction α marked]

Slide 76

Slide 76 text

Negative α
Some fraction of original capacity is added (opposite sign)
[Figure: C(p) versus p, with the fraction α marked]

Slide 77

Slide 77 text

Positive α Capacity Scaling
Growing capacity loss as system is scaled out
[Figure: C(p) versus p for Node 1 through Node 6]

Slide 78

Slide 78 text

Negative α Capacity Scaling
Growing capacity increase as system is scaled out
[Figure: C(p) versus p for Node 1 through Node 6]

Slide 79

Slide 79 text

Negative α in the Data
This is how it would appear in scalability measurements
[Figure panels: Linear, Sublinear, Superlinear; x-axis: p, y-axis: C(p)]
Can generalize this concept to nonlinear scalability

Slide 80

Slide 80 text

Postgres 9.2FL Analysis

Slide 81

Slide 81 text

Postgres: PG 9.2FL measurements
[Figure: Raw data for PG92flX; x-axis: Clients (N), y-axis: TPS X(N)]

Slide 82

Slide 82 text

Postgres: PG 9.2FL scalability N ≤ 48
[Figure: USL Analysis of PG92flX; x-axis: Clients (N), y-axis: TPS X(N)]
Fit legend: α = −0.0109191, β = 0.000257488, R² = 0.9521, Nmax = 62.66, Xmax = 210508.7, Xroof = NaN, Z(sec) = NaN, TS = 1604121213
NJG, Mon Apr 16 12:13:47 2012

Slide 83

Slide 83 text

Postgres: PG 9.2FL scalability N ≤ 80
[Figure: USL Analysis of PG92flX; x-axis: Clients (N), y-axis: TPS X(N)]
Fit legend: α = −0.0155072, β = 0.000386942, R² = 0.9579, Nmax = 51.23, Xmax = 186930.1, Xroof = NaN, Z(sec) = NaN, TS = 1604121214
NJG, Mon Apr 16 12:14:49 2012

Slide 84

Slide 84 text

Superlinear scaling zones
$C_N(\alpha, \beta) = \frac{N}{1 - \alpha (N - 1) + \beta N (N - 1)}$
(On this slide the minus sign is written explicitly, so α here stands for the magnitude |α|.)
[Figure: scaling curve with regions labelled Superlinear and Payback; x-axis: N, y-axis: C(N)]
(a) Data in superlinear zone
  - where C(N)/N > 100%
  - like perpetual motion
(b) Data in payback zone
  - paying the piper
  - sudden degradation
  - where C(N)/N < 100%
(c) Is it always like this?

Slide 85

Slide 85 text

Superlinear Payback Theorem
Theorem (Gunther 2012): Superlinear scaling in the USL model, with α < 0 and β > 0, always induces capacity degradation because the following properties hold:
1. Superlinear asymptote at: $N_\alpha = \frac{\alpha - 1}{\alpha}$
2. Inflection point $N_\pm$ is the smallest positive root of: $\partial_N^2 C_{sl}(N, -\alpha, \beta) = N^3 \beta^2 + (3N - 1)(\alpha - 1)\beta + (\alpha - 1)\alpha = 0$
3. Capacity maximum at: $N_{max} = \sqrt{\frac{1 - \alpha}{\beta}}$
4. $\forall N > N_\pm$, superlinear capacity $C_{sl}(N)$ crosses the linear bound at: $N_x = \left| \frac{\alpha}{\beta} \right|$
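As a worked check that is not on the slides, three of the four locations in the theorem follow from a few lines of algebra on the USL with signed α < 0 and β > 0. The asymptote is taken for the pure superlinear segment, which is assumed here to mean β = 0.

```latex
\begin{aligned}
&\text{Asymptote } (\beta = 0):
  && 1 + \alpha (N - 1) = 0
  \;\Rightarrow\; N_\alpha = 1 - \frac{1}{\alpha} = \frac{\alpha - 1}{\alpha} \\[4pt]
&\text{Maximum:}
  && \frac{dC}{dN}
     = \frac{(1 - \alpha) - \beta N^2}{\bigl(1 + \alpha (N - 1) + \beta N (N - 1)\bigr)^2} = 0
  \;\Rightarrow\; N_{max} = \sqrt{\frac{1 - \alpha}{\beta}} \\[4pt]
&\text{Linear crossing:}
  && C(N) = N
  \;\Rightarrow\; \alpha (N - 1) + \beta N (N - 1) = 0
  \;\Rightarrow\; (N - 1)(\alpha + \beta N) = 0 \\
& && \Rightarrow\; N_x = -\frac{\alpha}{\beta} = \left| \frac{\alpha}{\beta} \right|
  \quad (\alpha < 0,\ \beta > 0)
\end{aligned}
```

The crossing exists only because β > 0: with β = 0 the factor α + βN never changes sign, which is the algebraic content of the payback argument.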

Slide 86

Slide 86 text

Visual proof: Superlinear asymptote
[Figure: two panels of C(N) versus N]
Proof.
- Linear bound: C(N)/N = 1 (dashed line)
- Super efficient region: C_sl(N)/N > 1
- Superlinear segment curved upward by α < 0 (convex function)
- Asymptote at N = N_α (vertical line) where C_sl(N, −α) → ∞

Slide 87

Slide 87 text

Visual proof: Upper bound and Saturation
[Figure: two panels of C(N) versus N]
Proof.
- A physical capacity bound must exist (dashed horizontal line)
- C_sl(N) scaling curve will saturate below that bound (2nd red segment)
- That saturation segment must cross linear bound at N_x
- Therefore, there must be an inflection point in C_sl(N) at N_± < N_x

Slide 88

Slide 88 text

Visual proof: Inflection, Crossing and Degradation
[Figure: two panels of C(N) versus N]
Proof.
- Inflection point N_± joins superlinear and saturation segments
- C_sl(N) crosses linear bound at N_x = |α/β|
- Since α < 0, crossing can only arise from coherency term with β > 0
- Hence, superlinearity always induces coherency roll off (payback)

Slide 89

Slide 89 text

Payback parameters for PG 9.2FL
[Figure: USL Analysis of PG92flX; x-axis: Clients (N), y-axis: TPS X(N)]
Fit legend: α = −0.0155072, β = 0.000386942, R² = 0.9579, Nmax = 51.23, Xmax = 186930.1, Xroof = NaN, Z(sec) = NaN, TS = 1604121214
NJG, Mon Apr 16 12:14:49 2012
With α = −0.0155 and β = 0.000387:
- N_± = 14.0351
- N_x = 40.0517
- N_max = 51.2253
- N_α = 65.5161
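The four payback parameters can be recomputed from the rounded α and β quoted on this slide. The sketch below assumes the formulas of the Superlinear Payback Theorem with signed α, takes the capacity maximum as sqrt((1 − α)/β), and finds the inflection point as the smallest positive real root of the theorem's cubic. Variable names are illustrative.

```r
alpha <- -0.0155     # rounded fit values quoted on the slide
beta  <-  0.000387

N_alpha <- (alpha - 1) / alpha        # superlinear asymptote
N_max   <- sqrt((1 - alpha) / beta)   # capacity maximum
N_x     <- abs(alpha / beta)          # crossing of the linear bound

# Inflection point: roots of
#   N^3*beta^2 + (3N - 1)*(alpha - 1)*beta + (alpha - 1)*alpha = 0
# Collecting powers of N gives the coefficients below (increasing order).
coefs <- c((alpha - 1) * (alpha - beta),   # constant term
           3 * (alpha - 1) * beta,         # N^1
           0,                              # N^2
           beta^2)                         # N^3
roots <- polyroot(coefs / beta^2)          # rescale for numerical comfort

# Keep real roots only, then take the smallest positive one.
real_roots <- Re(roots)[abs(Im(roots)) < 1e-6]
N_infl <- min(real_roots[real_roots > 0])

print(round(c(N_infl = N_infl, N_x = N_x, N_max = N_max, N_alpha = N_alpha), 4))
```

The ordering N_± < N_x < N_max < N_α that comes out for these parameters matches the sequence of values listed on the slide.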

Slide 90

Slide 90 text

Outline (current section: Summary)
1. Review of USL
2. Application of USL
   - Memcache
   - Varnish
   - Postgres
3. Superlinearity
   - Something for nothing
   - Mathematica modeling
   - Postgres 9.2FL superlinearity
4. Summary

Slide 91

Slide 91 text

Summary
- USL is a 2-parameter scalability model C(N, α, β)
- Requires α, β > 0 for C(N) to be a concave function
- Superlinear measurements C(N)/N > 1 do exist
- Extra fitting parameter C(N, α, β, γ) ⇒ JvN elephants
- Discovered superlinear USL with α < 0
- Super-efficiencies are not free
- Like perpetual motion:
  - no free lunch
  - pay the piper eventually
  - debugging it is the hard part
- Thm: Superlinearity always followed by capacity degradation
- More (Oracle ???) superlinear measurements would be good

Slide 92

Slide 92 text

Thank you for attending!
Castro Valley, California
www.perfdynamics.com
perfdynamics.blogspot.com
Twitter/DrQz
Facebook
[email protected]
+1-510-537-5758