Slide 1

Slide 1 text

More Than Monitoring
#monitoring ++
Neil Gunther, Performance Dynamics
Monitorama Keynote, Boston, March 28, 2013
© 2013 Performance Dynamics. More Than Monitoring, March 30, 2013

Slide 2

Slide 2 text

Let’s Get Calibrated about Data
Outline
1 Let’s Get Calibrated about Data
2 Potted History of Monitoring
3 Performance Visualization Basics
4 Monitored Data are Time Series
5 Performance Visualization in R
6 Possible Hacks

Slide 3

Slide 3 text

Guerrilla Mantra: All data is wrong by definition
Measurement is a process, not math.
All data contains measurement errors.
How big are they and can you tolerate them?
Treating data as divine is a sin.

Slide 5

Slide 5 text

Guerrilla Mantra: VAMOOS your data doubts
Visualize, Analyze, Modelize, Over and Over until Satisfied

Slide 7

Slide 7 text

Guerrilla Mantra: There are only 3 performance metrics
1 Time, e.g., cpu_ticks
2 Rate (inverse time), e.g., httpGets/s
3 Number or count, e.g., RSS
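
The three metric types are connected: a rate is a count divided by a time interval. As a minimal sketch, assuming a monitor that samples a monotonically increasing counter, a rate can be derived from counts and timestamps as follows. The function name `counter_to_rates` and the sample values are hypothetical and do not come from the talk.

```python
def counter_to_rates(samples):
    """Convert (timestamp_seconds, cumulative_count) samples into rates.

    Returns one rate (count per second) for each consecutive pair of samples.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((c1 - c0) / dt)
    return rates


# Hypothetical samples: (time in seconds, cumulative HTTP GET count)
samples = [(0, 0), (10, 500), (20, 1500), (30, 1800)]
rates = counter_to_rates(samples)
print(rates)
```

Each rate is only as good as the timestamps and counter reads behind it, which is one place measurement error enters.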

Slide 9

Slide 9 text

Watch Out for Patterns
I mean that in a bad way. Your brain can’t help itself.

Slide 10

Slide 10 text

Outline: 2 Potted History of Monitoring

Slide 11

Slide 11 text

Old Adage: “Nothing New in Computer Science”
Mainframes didn’t need real-time monitoring. Batch processing.

Slide 12

Slide 12 text

How You Programmed It

Slide 13

Slide 13 text

Later ... the interface improved

Slide 14

Slide 14 text

CTSS (Compatible Time-Sharing System) developed in 1961 at MIT on IBM 7094.
Compatible meant compatibility with the standard IBM batch processing O/S.

Slide 15

Slide 15 text

Multics Instrumentation c.1965
Multics was a multiuser O/S following CTSS time-share.
The Implementation
“a rough measure of response time for a time-sharing console user, an exponential average of the number of users in the highest priority scheduling queue is continuously maintained. An integrator, L, initially zero, is updated periodically by the formula
L ← L × m + N_q
where N_q is the measured length of the scheduling queue at the instant of update, and m is an exponential damping constant”
This equation is an iterative form of an exponentially damped moving average. In modern terminology, it’s a data smoother.
The Lesson
“experience with Multics, and earlier with CTSS, shows that building permanent instrumentation into key supervisor modules is well worth the effort, since the cost of maintaining well-organized instrumentation is low, and the payoff is very high.”
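
The quoted update rule can be sketched in a few lines. This is a minimal illustration, not the Multics code: the function names, the damping constant m = 0.5, and the constant queue length of 4 are assumptions chosen for the example. For a constant queue length N_q the integrator settles at N_q / (1 − m), and weighting the new sample by (1 − m) turns it into a smoother that settles at N_q itself.

```python
def multics_update(L, m, Nq):
    """One step of the quoted integrator: L <- L * m + Nq."""
    return L * m + Nq


def normalized_update(avg, m, Nq):
    """Same damping, with the new sample weighted by (1 - m)."""
    return avg * m + Nq * (1.0 - m)


m = 0.5          # hypothetical damping constant, 0 < m < 1
queue_len = 4    # hypothetical constant queue length

L = 0.0
avg = 0.0
for _ in range(60):
    L = multics_update(L, m, queue_len)
    avg = normalized_update(avg, m, queue_len)

print(L, queue_len / (1.0 - m))   # integrator vs. its steady-state value
print(avg)                        # normalized smoother approaches queue_len
```

A value of m closer to 1 gives heavier damping: the smoother reacts more slowly to changes in queue length.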

Slide 16

Slide 16 text

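You know this better as ... Linux load average
58 extern unsigned long avenrun[]; /* Load averages */
59
60 #define FSHIFT 11 /* nr of bits of precision */
61 #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
62 #define LOAD_FREQ (5*HZ) /* 5 sec intervals */
63 #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */
64 #define EXP_5 2014 /* 1/exp(5sec/5min) */
65 #define EXP_15 2037 /* 1/exp(5sec/15min) */
66
67 #define CALC_LOAD(load,exp,n) \
68 load *= exp; \
69 load += n*(FIXED_1-exp); \
70 load >>= FSHIFT;
Lines 67–70 are identical to the 1965 Multics formula. See Chap. 4 of my Perl::PDQ book for the details.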
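
To see the fixed-point arithmetic at work, here is a minimal Python emulation of the load-average update. It is a sketch, not kernel code: `calc_load` is a hypothetical stand-in for the macro, and it assumes the caller passes the number of active tasks already scaled by FIXED_1. The constant two active tasks is an invented input.

```python
import math

FSHIFT = 11                 # bits of fixed-point precision
FIXED_1 = 1 << FSHIFT       # 1.0 in fixed-point (2048)
EXP_1 = 1884                # exp(-5s/1min) in fixed-point


def calc_load(load, exp, n):
    """Damp the old load, add the weighted new sample, rescale."""
    load *= exp
    load += n * (FIXED_1 - exp)
    load >>= FSHIFT
    return load


# The damping constant is just exp(-5/60) scaled to fixed-point.
print(round(FIXED_1 * math.exp(-5.0 / 60.0)))

# Hypothetical workload: two tasks always active, sampled every 5 seconds.
active = 2 * FIXED_1
load = 0
for _ in range(500):
    load = calc_load(load, EXP_1, active)

print(load / FIXED_1)       # 1-minute load average, close to 2
```

The right shift truncates, so the emulated value settles slightly below the true queue length rather than exactly on it.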

Slide 22

Slide 22 text

Unix at Bell Labs c.1970
CTSS begat Multics begat Unics begat Unix
Get it?

Slide 23

Slide 23 text

Then Came Screens
Note the mouse in her right hand.

Slide 24

Slide 24 text

Unix top: A Legacy App
Green ASCII characters on black background

Slide 25

Slide 25 text

Desktop GUI c.1995
Lots of colored spaghetti

Slide 26

Slide 26 text

Static Charts on the Web c.2000
Load average over a 24-hour period, with the 1-, 5-, and 15-minute load averages drawn as green, blue, and red time series (which is completely redundant, BTW).
As informative as watching a ticker chart on Wall Street

Slide 27

Slide 27 text

Browser-based Dashboards
Interminable strip charts are not good for your brain.

Slide 28

Slide 28 text

Outline: 3 Performance Visualization Basics

Slide 29

Slide 29 text

The Central Challenge
Find the best cognitive impedance match between the digital computer and the neural computer

Slide 30

Slide 30 text

Cognitive Circuitry is Largely Unknown
PerfViz is an N-dimensional problem
Brain is trapped in (3 + 1)-dimensions
No 5-fold rotational symmetry
Physicists have all the fun with SciViz
Time dimension becomes animation sequence

Slide 31

Slide 31 text

Your Brain is Easily Fooled
All cognition is computation
Your brain is a differential analyzer
Difference errors produce perceptual illusions

Slide 32

Slide 32 text

Outline: 4 Monitored Data are Time Series

Slide 33

Slide 33 text

Gothic graphs can hurt your brain
(Bad Z value)

Slide 34

Slide 34 text

There’s a Whole Science of Color

Slide 35

Slide 35 text

Pastel Colors on White
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]

Slide 36

Slide 36 text

Pastel Colors on Black
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]

Slide 37

Slide 37 text

Pastel Colors on Neutral Gray
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]

Slide 38

Slide 38 text

Coordinated Colors on Neutral Gray
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, test4.AllOff]

Slide 39

Slide 39 text

Time Series Can Reveal Data Correlations
[Figure: server.p.65, 2012-05-03 to 2012-05-04; stacked time series of CPU%, Mem%, ioWait%, and LdAvg-1 vs. Time]
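
Eyeballing stacked strip charts is one way to spot a correlation; computing it is another. The following minimal sketch calculates the Pearson correlation coefficient between two equally sampled series. The `pearson` helper and the sample values are hypothetical and are not taken from the server data in the figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical samples taken at the same timestamps
cpu_pct = [10.0, 20.0, 30.0, 40.0]
ld_avg = [0.1, 0.2, 0.3, 0.4]
free_pct = [90.0, 80.0, 70.0, 60.0]

r_load = pearson(cpu_pct, ld_avg)    # series that move together
r_free = pearson(cpu_pct, free_pct)  # series that move in opposition
print(r_load, r_free)
```

A coefficient near +1 or −1 only says the two series move together linearly; it does not say which one drives the other.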

Slide 40

Slide 40 text

But Data Doesn’t Tell All: Monitored Server Consumption
[Figure: Monitored Server Consumption; Capacity (U%) vs. Time (m:s), showing Uavg data, Umax data, and the server saturation level]

Slide 41

Slide 41 text

Beyond Data: Effective Server Consumption
[Figure: Lookahead Server Consumption; Capacity (U%) vs. Time (m:s), showing Uavg data, Umax data, Ueff predicted, effective max consumption, and the server saturation level]

Slide 42

Slide 42 text

Outline: 5 Performance Visualization in R

Slide 43

Slide 43 text

Choose Your Cognitive Z in R
[Figure: scatterplot matrix of mpg, disp, drat, and wt; 3D Scatterplot of wt, disp, and mpg]

Slide 44

Slide 44 text

Enhanced Plots in R
[Figure: four panels of Xp vs. p: Raw bench data; Data smoother; USL fit; USL fit + CI bands]
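
The "USL fit" panels refer to the Universal Scalability Law. As a hedged sketch, assuming its usual three-parameter form X(p) = γp / (1 + α(p − 1) + βp(p − 1)), the model curve and the load at which throughput peaks can be evaluated as below. The parameter values are hypothetical and are not fitted to the bench data in the figure; the slide's own fit was done in R.

```python
import math


def usl_throughput(p, gamma, alpha, beta):
    """USL throughput X(p) for p users or processors.

    gamma: throughput at p = 1
    alpha: contention (serialization) coefficient
    beta:  coherency (crosstalk) coefficient
    """
    return gamma * p / (1.0 + alpha * (p - 1) + beta * p * (p - 1))


def usl_peak(alpha, beta):
    """Load p* that maximizes X(p): sqrt((1 - alpha) / beta)."""
    return math.sqrt((1.0 - alpha) / beta)


# Hypothetical coefficients, for illustration only
gamma, alpha, beta = 1.0, 0.02, 0.0001

p_star = usl_peak(alpha, beta)
best_p = max(range(1, 301), key=lambda p: usl_throughput(p, gamma, alpha, beta))
print(p_star, best_p, usl_throughput(best_p, gamma, alpha, beta))
```

Overlaying such a curve on raw data is what turns a scatter of benchmark points into a statement about where scaling stops paying off.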

Slide 45

Slide 45 text

Chernoff Faces in R
Example (using R)
library(TeachingDemos)
faces2(matrix(runif(18*10), nrow=12), main='Random Faces')

Slide 46

Slide 46 text

Kiviat and Radar Charts in R
[Figure: Correlation Radar; radial plot of the named factor correlations, radial axis from -0.5 to 1]
Example (using R)
require(plotrix)
corelations <- c(1:97)
corelation.names <- names(corelations) <- c("Alp12Mn", "AvrROE", "DivToP", "GrowAPS", "GrowAsst", "GrowBPS", "GrowCFPS", ...
corelations <- c(0.223, 0.1884, -0.131, 0.1287, 0.0307, ...
par(ps=6)
radial.plot(corelations, labels=corelation.names, rp.type="p", main="Correlation Radar", radial.lim=c(-1,1), line.col="blue")

Slide 47

Slide 47 text

Treemaps in R
[Figure: two treemaps titled “GDAT: Top 100 Websites”, one labeled by site category and one by site name; color scale from -8e+09 to 8e+09]
Example (using R)
library(portfolio)
bbc <- read.csv("nielsen100-2010.csv")
map.market(id=seq(1:100), area=bbc$uniqueAudience, group=bbc$categoryBBC, color=bbc$totalVisits, main="GDAT: Top 100 Websites")
There is another treemap pkg on CRAN

Slide 48

Slide 48 text

Heatmap of Multiple Servers in Time

Slide 49

Slide 49 text

Barry in 2D
[Figure: triangle plots with vertices p1, p2, p3, marking the points (p1, p2, p3) = (1/3, 1/3, 1/3) and (0.6, 0.1, 0.3)]
Barycentric coordinate system for %CPU = %user + %sys + %idle
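
Because the three CPU percentages always sum to 100%, each sample is a single point inside a triangle. A minimal sketch of the mapping from (%user, %sys, %idle) to 2D plot coordinates follows. The vertex placement (an equilateral triangle with unit sides) and the function name `cpu_to_xy` are assumptions made for illustration, not the layout used in the slides.

```python
import math

# Assumed vertex positions for an equilateral triangle with unit sides
V_USER = (0.0, 0.0)
V_SYS = (1.0, 0.0)
V_IDLE = (0.5, math.sqrt(3.0) / 2.0)


def cpu_to_xy(user, system, idle):
    """Map a CPU breakdown to a point inside the triangle.

    The inputs are normalized first, so percentages or fractions both work.
    """
    total = user + system + idle
    if total <= 0:
        raise ValueError("components must sum to a positive value")
    p1, p2, p3 = user / total, system / total, idle / total
    x = p1 * V_USER[0] + p2 * V_SYS[0] + p3 * V_IDLE[0]
    y = p1 * V_USER[1] + p2 * V_SYS[1] + p3 * V_IDLE[1]
    return x, y


center = cpu_to_xy(1.0, 1.0, 1.0)       # equal thirds land on the centroid
busy = cpu_to_xy(60.0, 10.0, 30.0)      # p1=0.6, p2=0.1, p3=0.3
print(center, busy)
```

A time series of samples then becomes a trajectory inside the triangle, and drifting toward a vertex shows which component is taking over.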

Slide 50

Slide 50 text

Barry in 3D: Tukey-like Rotations
Tukey trumps Tufte
Barycentric coordinate system for %BW = %unicast + %multicast + %broadcast + %idle

Slide 51

Slide 51 text

Outline: 6 Possible Hacks

Slide 52

Slide 52 text

Interactive and Streaming in R
R derives from S at Bell Labs (home of Unix) c.1975, 1980, 1988
R scripting language console interface
> (x^(k-1)*exp(-x/s))/(gamma(k)*s^k)
cf. Mathematica document paradigm
$\frac{x^{k-1}\, e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}$
No fonts, no symbolic computation
More recent focus is on enabling:
Better IDE integration, e.g., RStudio
Browser-based interaction, e.g., Shiny
Streaming data acquisition, e.g., R plus Hadoop, but ...
R interpreter is single-threaded
Needs a full app stack b/w data and R engine
Revolution Analytics is in this space
Plenty of room for innovative development
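
The console expression on this slide is the gamma probability density with shape k and scale θ (written s at the console). For comparison with the R one-liner, here is a minimal numerical sketch; the function name `gamma_pdf` and the evaluation points are chosen for illustration only.

```python
import math


def gamma_pdf(x, k, theta):
    """Gamma density: x^(k-1) * exp(-x/theta) / (Gamma(k) * theta^k)."""
    if x < 0:
        return 0.0
    return (x ** (k - 1) * math.exp(-x / theta)) / (math.gamma(k) * theta ** k)


# With k = 1 the gamma density reduces to the exponential density.
print(gamma_pdf(0.0, 1, 2.0))
print(gamma_pdf(1.0, 2, 1.0))
```

Like the R console form, this is purely numeric: it evaluates the formula at a point but cannot manipulate it symbolically.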

Slide 53

Slide 53 text

Some Ideas for Tomorrow
1 Lots of opportunities
2 Coupling simple statistical analysis to monitored data
3 Display the errors in monitored data
4 Replace the black background in Graphite
5 Apply ColorBrewer to Graphite
6 Apply effective capacity consumption to your monitored data
7 Replacing strip charts with animation
WARNING
Common sense is the pitfall of all performance analysis
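
Ideas 2 and 3 can start very small. The sketch below reports a monitored metric as a mean with its standard error and an approximate 95% band instead of a bare number. The sample values and the `summarize` helper are hypothetical, and the 1.96 multiplier assumes a normal approximation; a t-multiplier would be more appropriate for a sample this small.

```python
import math
import statistics


def summarize(samples, z=1.96):
    """Return (mean, standard error, (low, high)) for a list of samples."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, sem, (mean - z * sem, mean + z * sem)


# Hypothetical CPU-busy percentages from five consecutive samples
cpu_busy = [48.0, 52.0, 50.0, 49.0, 51.0]
mean, sem, (low, high) = summarize(cpu_busy)
print(f"CPU busy: {mean:.1f}% +/- {sem:.1f}% (band {low:.1f}% to {high:.1f}%)")
```

Drawing the band alongside the line on a dashboard makes it visible when two readings that look different are in fact within measurement error of each other.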

Slide 54

Slide 54 text

Modelizing GitHub Growth
Since I didn’t discuss the modeling part of VAMOOS ...
Donnie Berkholz of redmonk.com wrote on his Jan 21, 2013 blog that GitHub will reach:
4 million users near Aug 2013
5 million users near Dec 2013
That’s based on a log-linear model. I claim it’s a log-log model and therefore:
4 million users around Oct 2013
5 million users around Apr 2014
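
The difference between the two forecasts comes down to which axes are made linear before fitting a straight line. A log-linear fit assumes exponential growth, while a log-log fit assumes power-law growth. The sketch below shows both with ordinary least squares on synthetic data generated from a power law. The data and helper names are invented for illustration; these are not GitHub's user counts.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares: return (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_log_linear(ts, ys):
    """Exponential model: log(y) is linear in t."""
    b, a = fit_line(ts, [math.log(y) for y in ys])
    return lambda t: math.exp(a + b * t)


def fit_log_log(ts, ys):
    """Power-law model: log(y) is linear in log(t)."""
    b, a = fit_line([math.log(t) for t in ts], [math.log(y) for y in ys])
    return lambda t: math.exp(a + b * math.log(t))


# Synthetic power-law data, y = 2 * t^3 (not real measurements)
ts = [1, 2, 3, 4, 5, 6]
ys = [2 * t ** 3 for t in ts]

exponential = fit_log_linear(ts, ys)
power_law = fit_log_log(ts, ys)
print(exponential(10), power_law(10))
```

When the underlying growth is a power law, the exponential fit extrapolates far too high, which is the same as predicting that a milestone will arrive too early.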

Slide 55

Slide 55 text

Performance Dynamics Company
Castro Valley, California
www.perfdynamics.com
perfdynamics.blogspot.com
twitter.com/DrQz
Facebook
[email protected]
OFF: +1-510-537-5758