Slide 1

How to Build a Performance Testing Pipeline from Scratch
Valera Zakharov

Slide 2

No content

Slide 3

No content

Slide 4

No content

Slide 5

"A Slack client should be fast as fuck!"

Slide 6

Let’s fix some perf bugs

Slide 7

Trends

Slide 8

Alerts

Slide 9

Alerts: Pre-merge

Slide 10

Naive Approach
- Measure dev version value (execution time, frame metrics, resource usage: anything that can be measured)
- Compare against baseline (latest master)
- Alert if they are different

Slide 11

Problem: results are variable, and sometimes VERY variable

Slide 12

Stats to the Rescue: COMPARE dev build values against master build values

Slide 13

Mann-Whitney U Test: p-value → confidence

Slide 14

Statistical Approach
- Collect set of N values from dev version
- Test against data set from master
- Alert if confidence > threshold
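The steps above can be sketched in a few lines of Python. This is a minimal sketch using the normal approximation to the Mann-Whitney U test; a real pipeline would more likely call a stats library (e.g. scipy.stats.mannwhitneyu), and the threshold value here is illustrative only:

```python
import math

def mann_whitney_confidence(dev, master):
    """Two-sided Mann-Whitney U test via the normal approximation.
    Returns a 'confidence' in [0, 1) that the two samples differ:
    confidence = 1 - p_value."""
    combined = sorted((v, i) for i, v in enumerate(dev + master))
    # Assign midranks so tied values get the same rank.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[combined[k][1]] = midrank
        i = j + 1
    n1, n2 = len(dev), len(master)
    r1 = sum(ranks[:n1])            # rank sum of the dev sample
    u1 = r1 - n1 * (n1 + 1) / 2     # U statistic for dev
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return 1 - p

def should_alert(dev, master, threshold=0.99):
    # Alert only when confidence that dev differs from master
    # exceeds the (tunable) threshold.
    return mann_whitney_confidence(dev, master) > threshold
```

Identical samples produce confidence near zero (no alert); clearly separated samples push confidence toward 1 and trip the threshold.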

Slide 15

Statistical Approach (WE CONTROL THESE: N and the alert threshold)
- Collect set of N values from dev version
- Test against data set from master
- Alert if diff confidence > threshold

Slide 16

Statistical Approach
- Higher number of values = better stats, but more device time
- Higher alert threshold = lower false alert rate, but lower chance of valid detection

Slide 17

For whom?

Slide 18

Trust = Valid detections / Noise

Slide 19

Trust = Valid detections / Noise

Slide 20

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 21

PerfTest Job: run tests + gather data

Slide 22

Naive approach (PerfTest Job: run tests + gather data)
Build Node (runner) → test metrics → aggregate metrics → backend

Slide 23

Naive approach (PerfTest Job: run tests + gather data)
Build Node (runner) → test metrics (×3) → aggregate metrics → backend

Slide 24

Naiv-ish approach (PerfTest Job: run tests + gather data)
device provider (get release) + Build Node (runner) → test metrics (×3) → aggregate metrics → backend

Slide 25

No content

Slide 26

Do you have resources to build this?
https://code.fb.com/android/the-mobile-device-lab-at-the-prineville-data-center

Slide 27

Cloud Version (PerfTest Job: run tests + gather data)
device provider (get release) + Build Node (runner) → test metrics (×3) → aggregate metrics → backend

Slide 28

Cloud Version (PerfTest Job: run tests + gather data): stability, scalability, cost ($), control

Slide 29

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 30

No content

Slide 31

Instrumented Application / Instrumentation Test

EventTracker.startPerfTracking(Beacon.CHANNEL_SYNC)
// code that does channel sync
EventTracker.endPerfTracking(Beacon.CHANNEL_SYNC)

Resulting metrics (name,value):
persist_rtm_start,44
process_rtm_start,19
ms_time_to_connect,703
channel_sync,381

Slide 32

Focus on client
- Network is highly unstable and variable
- Backend regressions should not block client developers
- Use Record & Replay: github.com/airbnb/okreplay
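The record-and-replay idea can be sketched independently of any library. The sketch below is conceptual Python, not okreplay's actual API (okreplay plugs into OkHttp on Android); all names here are illustrative. Real responses are captured once, then served from the tape during perf runs so backend variance never reaches the client under test:

```python
import json
import os

class ReplayTape:
    """Records responses keyed by request, then replays them.
    Conceptual sketch of record & replay; names are illustrative."""

    def __init__(self, path, mode):
        self.path, self.mode = path, mode  # mode: "record" or "replay"
        self.tape = {}
        if mode == "replay" and os.path.exists(path):
            with open(path) as f:
                self.tape = json.load(f)

    def fetch(self, request_key, do_real_request):
        if self.mode == "replay":
            # Serve the recorded response; never touch the network.
            return self.tape[request_key]
        # Record mode: hit the real backend once and persist the result.
        response = do_real_request(request_key)
        self.tape[request_key] = response
        with open(self.path, "w") as f:
            json.dump(self.tape, f)
        return response
```

During recording the test talks to the real backend once; during perf runs only the tape is read, so the measured timings cover client work alone.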

Slide 33

Keep it real
- We want to catch regressions that represent the real world
- Preserve the prod object graph
- Run against release-like config (LargeTest)

Slide 34

Make it stable
- Perf tests will be executed a lot; the stability bar is very high
- Don't compromise on flakiness
- Use IdlingResource

Slide 35

Keep it working

Slide 36

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 37

Backend API: createBuild | completeBuild
perf data → check sanity → store & analyze data

Slide 38

Backend Perf Data

{
  "build_info": {
    "platform": "android",
    "author_slack_id": "W1234567",
    "branch_name": "master",
    "build_cause": "Fixed sort order for starred unreads. (#9838)",
    "id": 8668,
    "jenkins_build_number": "9287",
    "author_name": "Kevin Lai",
    "job_name": "android-master-perf"
  },
  "tests": [
    {
      "status": "complete",
      "name": "com.Slack.ui.perf.SignInPerfTest#firstSignin_medium",
      "metric_results": [
        {"name": "inflate_flannel_start", "value": 263},
        {"name": "quickswitcher_show", "value": 30},
        {"name": "inflate_flannel_start", "value": 314},
        {"name": "quickswitcher_show", "value": 45}
      ]
    }
  ]
}
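On ingest, the backend mostly needs to group the repeated metric_results entries into per-metric sample lists (the sets the statistical test runs on). A minimal sketch, with field names taken from the payload above and everything else assumed:

```python
from collections import defaultdict

def collect_samples(perf_data):
    """Group repeated metric_results entries into sample lists,
    keyed by (test name, metric name)."""
    samples = defaultdict(list)
    for test in perf_data["tests"]:
        if test["status"] != "complete":
            continue  # skip partial runs before analysis
        for result in test["metric_results"]:
            samples[(test["name"], result["name"])].append(result["value"])
    return dict(samples)
```

For the payload above this would yield, e.g., [30, 45] for ("com.Slack.ui.perf.SignInPerfTest#firstSignin_medium", "quickswitcher_show").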

Slide 39

Backend Stack: new shiny tech is great… but use whatever stack you have in house

Slide 40

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 41

Trends

Slide 42

(trends screenshot)

Slide 43

(trends screenshot)

Slide 44

(trends screenshot)

Slide 45

(trends screenshot)

Slide 46

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 47

Alert

Slide 48

(alert screenshot)

Slide 49

(alert screenshot)

Slide 50

(alert screenshot)

Slide 51

(alert screenshot)

Slide 52

(alert screenshot)

Slide 53

(alert screenshot)

Slide 54

More on debugging
- Pre-merge alerting is great for experimenting
- Detailed trace info would be nice
- https://github.com/facebookincubator/profilo looks promising

Slide 55

(alert screenshot)

Slide 56

(alert screenshot)

Slide 57

Trust

Slide 58

Weekly Stats

Slide 59

Weekly Stats

Slide 60

Infrastructure Stability

Slide 61

Pipeline overview: open PR / merge to master → trigger perf run → PerfTest Job (run tests + gather data) → perf data → Backend → Alert + Trends

Slide 62

No content

Slide 63

Thank you, Lviv! (Дякую Львів!) Questions? @valera_zakharov