PyCon ID 2019 - Introduction to Changepoint Analysis

Slide 1

Slide 1 text

Introduction to Change Point Analysis
PyCon ID 2019
Elvyna Tunggawan, Data Scientist at Airy

Slide 2

Slide 2 text

Outline
1. Introduction: What is change point analysis?
2. Methodology: How do we find the “change”?
3. Conclusions: What have we learned?

Slide 3

Slide 3 text

1. Introduction: What is change point analysis?

Slide 4

Slide 4 text

Whoa! At the end of your peaceful Friday, a product manager came and asked a question...

Slide 5

Slide 5 text

Our reviews are getting better! It’s because of our new feature release, isn’t it?

Slide 6

Slide 6 text

Is it really getting better?

Slide 7

Slide 7 text

What is change point analysis?
Given a series of data, change point analysis detects the number and location of change points: the locations in the data where some feature, for example the mean, changes.

Slide 8

Slide 8 text

There are two types of change point analysis:
Offline
- all data are processed in one go
- main goal: accurate detection of changes
Online
- data must be processed quickly, “on the fly”, before new data arrives
- main goal: the quickest detection of a change after it has occurred

Slide 9

Slide 9 text

2. Methodology: How do we find the “change”?

Slide 10

Slide 10 text

A. Control charts
- Common in process control
- Use an average line plus lower and upper control limits
- Focus on the point-wise error rate
- The lower and upper limits are determined from the standard deviation
Example of control chart (source)
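
The control-limit idea above can be sketched in a few lines. This is a minimal illustration, not the code behind the chart in the talk: the 3-standard-deviation multiplier, the function names, and the data are all assumptions made for the example.

```python
import numpy as np

def control_limits(values, n_sigma=3.0):
    """Return (center, lower, upper) limits from the sample mean and standard deviation."""
    values = np.asarray(values, dtype=float)
    center = values.mean()
    spread = values.std(ddof=1)  # sample standard deviation
    # n_sigma = 3 is a common convention, assumed here; the slide does not state it
    return center, center - n_sigma * spread, center + n_sigma * spread

def first_out_of_control(values, lower, upper):
    """Index of the first observation outside [lower, upper], or None if there is none."""
    values = np.asarray(values, dtype=float)
    outside = np.flatnonzero((values < lower) | (values > upper))
    return int(outside[0]) if outside.size else None

# Hypothetical data: limits come from a baseline period and are applied to new points.
baseline = [7.9, 8.1, 8.0, 8.2, 7.8, 8.0, 8.1, 7.9]
new_points = [8.0, 8.1, 7.9, 9.5, 8.0]
center, lcl, ucl = control_limits(baseline)
flagged = first_out_of_control(new_points, lcl, ucl)
print(center, lcl, ucl, flagged)
```

Because each point is judged on its own, a small but persistent shift that stays inside the limits would not be flagged by this check.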

Slide 11

Slide 11 text

Sample control chart
The first observation that lies above the upper control limit: day 55.

Slide 12

Slide 12 text

B. Change point analysis
- Can detect subtle changes frequently missed by control charts
- Can be conducted once all observations are collected, to identify the change-wise error rate
- Based on mean or variance

Slide 13

Slide 13 text

Method 1: Cumulative Sum (CUSUM)
Single change point analysis

Slide 14

Slide 14 text

Cumulative Sum (CUSUM)
Steps:
1. Calculate the mean value of all observations ($\bar{y}$)
2. Calculate the residuals: $\varepsilon_i = y_i - \bar{y}$
3. Set the cumulative sum of residuals at 0: $S_0 = 0$
4. Calculate the cumulative sum of residuals: $S_i = S_{i-1} + \varepsilon_i$
Example of CUSUM plot (source)

Slide 15

Slide 15 text

CUSUM: calculate mean

Slide 16

Slide 16 text

CUSUM: calculate residuals

Slide 17

Slide 17 text

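CUSUM: calculate cumulative sum of residuals

    def calculate_cusum_residuals(df, observation_column=0):
        ## mean of all observations
        mu = df[observation_column].mean()
        ## shift by one row so that row 0 can hold S_0 = 0
        ## (the last observation drops out of the sum)
        df = df.shift(1)
        df['residual'] = df[observation_column] - mu
        ## assumes the default integer index starting at 0
        df.loc[(df.index == 0), 'residual'] = 0
        df['residual_cumsum'] = df['residual'].cumsum()
        return df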

Slide 18

Slide 18 text

CUSUM: calculate cumulative sum of residuals
A sudden change is observed at day 52.
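
The slide reads the change off the CUSUM plot. One common convention, assumed here rather than taken from the talk, is to take the point where the cumulative sum is furthest from zero. The sketch below uses plain NumPy, hypothetical function names, and toy data.

```python
import numpy as np

def cusum(values):
    """Cumulative sum of residuals around the overall mean, with S_0 = 0 prepended."""
    values = np.asarray(values, dtype=float)
    residuals = values - values.mean()
    return np.concatenate(([0.0], np.cumsum(residuals)))

def estimate_change_point(values):
    """Number of observations before the estimated change (index of the largest |S_i|)."""
    s = cusum(values)
    return int(np.argmax(np.abs(s)))

# Hypothetical series: the mean jumps from 1 to 3 after the fourth observation.
series = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
m = estimate_change_point(series)
print(m)
```

Here the CUSUM falls while observations sit below the overall mean and climbs once they sit above it, so the turning point marks the last observation before the change.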

Slide 19

Slide 19 text

CUSUM: how confident are we that the change exists?
- Frequentist method
- Sampling without replacement: randomly reorder the observations

Slide 20

Slide 20 text

CUSUM: how confident are we that the change exists?

    def calculate_residual_difference(df):
        ## calculate difference between maximum and minimum cumsum residuals
        resid_max = df['residual_cumsum'].max()
        resid_min = df['residual_cumsum'].min()
        resid_diff = resid_max - resid_min
        return resid_diff

Slide 21

Slide 21 text

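CUSUM: how confident are we that the change exists?

    import numpy as np
    import pandas as pd

    ## observed residual difference
    ## (assumes df holds the raw observations in column 0)
    resid_diff = calculate_residual_difference(calculate_cusum_residuals(df))

    N = 1000 ## determine number of iteration
    X = 0 ## occurrence when sample residual difference < observed residual difference
    for i in np.arange(0, N):
        ## reorder the observations at random (sampling without replacement)
        _sample = pd.DataFrame(
            np.random.choice(df[0], size = df.shape[0], replace = False)
        )
        _sample = calculate_cusum_residuals(_sample)
        _sample_resid_diff = calculate_residual_difference(_sample)
        if _sample_resid_diff < resid_diff:
            X += 1

    confidence_level = 100 * X / N
    print("Confidence level: {:.2f}%".format(confidence_level))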
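
For readers who want to try the reordering test without the talk's dataset, here is a self-contained sketch of the same idea in plain NumPy. The function names, the seeds, and the synthetic data are assumptions for illustration. It also uses the full cumulative sum rather than the shifted DataFrame from the slides, so results can differ slightly.

```python
import numpy as np

def cusum_range(values):
    """Maximum minus minimum of the CUSUM of residuals (S_0 = 0 included)."""
    values = np.asarray(values, dtype=float)
    s = np.concatenate(([0.0], np.cumsum(values - values.mean())))
    return s.max() - s.min()

def permutation_confidence(values, n_iter=1000, seed=0):
    """Percentage of random reorderings whose CUSUM range is below the observed one."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = cusum_range(values)
    below = sum(cusum_range(rng.permutation(values)) < observed for _ in range(n_iter))
    return 100.0 * below / n_iter

# Hypothetical data: 30 noisy points around 0 followed by 30 around 2.
rng = np.random.default_rng(42)
data = np.concatenate([rng.normal(0, 1, 30), rng.normal(2, 1, 30)])
confidence = permutation_confidence(data)
print("Confidence level: {:.2f}%".format(confidence))
```

Reordering destroys any time structure, so if the observed range is larger than almost all reordered ranges, an ordering effect such as a change in mean is a plausible explanation.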

Slide 22

Slide 22 text

Method 2: Structure change model - MSE Estimator
Single change point analysis

Slide 23

Slide 23 text

Structure Change Model: MSE Estimator
Steps:
1. Split the data into 2 segments
   - segment 1 = {1, …, m}
   - segment 2 = {m+1, …, n}
2. Calculate the average value of each segment: $\bar{X}_1$ and $\bar{X}_2$
3. Calculate the mean squared error of the observations in each segment
4. The value of m which minimizes the MSE is the best estimator of the last point before the change occurred
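
The steps above can be sketched as a brute-force search over every possible split. This is a minimal illustration with a hypothetical function name and toy data. How the two segment errors are combined is an assumption: here the squared errors of both segments are summed and divided by n.

```python
import numpy as np

def mse_change_point(values):
    """Return (m, mse), where m is the size of segment 1 that minimizes the pooled error."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    best_m, best_mse = None, np.inf
    for m in range(1, n):  # both segments must be non-empty
        left, right = values[:m], values[m:]
        # squared error of each segment around its own mean
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        mse = sse / n
        if mse < best_mse:
            best_m, best_mse = m, mse
    return best_m, best_mse

# Hypothetical series: the level moves from about 5 to about 8 after the fourth point.
series = [5.0, 5.2, 4.8, 5.0, 8.1, 7.9, 8.0]
m, mse = mse_change_point(series)
print(m, mse)
```

The returned m counts observations from 1, so m is the last point before the change, matching step 4.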

Slide 24

Slide 24 text

MSE estimator: intuition

Slide 25

Slide 25 text

MSE estimator: intuition

Slide 26

Slide 26 text

MSE estimator: intuition
The value of m which minimizes the MSE is the best estimator of the last point before the change occurred → day 52

Slide 27

Slide 27 text

What if we’re looking for more than one change point?

Slide 28

Slide 28 text

Multiple Change Point: Binary Segmentation
Schematic view of the binary segmentation algorithm (source)
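
Binary segmentation finds the best single split, then repeats the search inside each resulting segment. The sketch below is one way to write that recursion, using a squared-error split as the single change point detector. The `min_gain` stopping threshold, the function names, and the data are assumptions for illustration, not the algorithm as implemented in any particular library.

```python
import numpy as np

def best_split(values):
    """Best single split: returns (m, gain), where gain is the drop in squared error."""
    n = len(values)
    total = ((values - values.mean()) ** 2).sum()
    best_m, best_gain = None, 0.0
    for m in range(1, n):
        left, right = values[:m], values[m:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if total - sse > best_gain:
            best_m, best_gain = m, total - sse
    return best_m, best_gain

def binary_segmentation(values, min_gain, offset=0):
    """Split recursively while the best split reduces squared error by at least min_gain."""
    values = np.asarray(values, dtype=float)
    if len(values) < 2:
        return []
    m, gain = best_split(values)
    if m is None or gain < min_gain:
        return []  # hypothetical stopping rule: no split is worthwhile
    left = binary_segmentation(values[:m], min_gain, offset)
    right = binary_segmentation(values[m:], min_gain, offset + m)
    return left + [offset + m] + right

# Hypothetical series with three levels (about 0, 5 and 2), four points each.
series = [0.0, 0.1, -0.1, 0.0, 5.0, 5.1, 4.9, 5.0, 2.0, 2.1, 1.9, 2.0]
change_points = binary_segmentation(series, min_gain=1.0)
print(change_points)
```

Each returned value is the number of observations before a change. The choice of `min_gain` controls how many change points are reported, so in practice it would need tuning or a penalty-based criterion.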

Slide 29

Slide 29 text

Libraries
ruptures, bayesloop, fbProphet, changepoint, bcp, strucchange, cpm

Slide 30

Slide 30 text

Confused? You can apply a Bayesian approach too! (Anonymous)

Slide 31

Slide 31 text

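Bayesian Approach
1. Set prior distributions for $\mu_1$, $\mu_2$, and the overall $\sigma$
2. The changepoint could occur at any $\tau \in \{1, \dots, n\}$
3. Assign: the mean is $\mu_1$ for observations up to $\tau$ and $\mu_2$ for observations after $\tau$
4. Produce the sample!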
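
To build intuition for what the sampler estimates, the posterior over τ can also be computed by direct enumeration in a simplified setting. Note that this swaps sampling for an exact calculation and rests on assumptions the slides do not make: σ is treated as known, both means get flat priors and are integrated out analytically, and τ runs over 1 to n-1 so that both segments are non-empty. The function name and the data are hypothetical.

```python
import numpy as np

def changepoint_posterior(values, sigma):
    """Posterior over tau (number of observations in segment 1), for tau = 1..n-1.

    Assumes a uniform prior on tau, flat priors on both means and a known sigma.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    log_post = np.empty(n - 1)
    for k, tau in enumerate(range(1, n)):
        left, right = values[:tau], values[tau:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        # log marginal likelihood with both means integrated out (constants dropped)
        log_post[k] = -sse / (2 * sigma**2) - 0.5 * np.log(tau) - 0.5 * np.log(n - tau)
    post = np.exp(log_post - log_post.max())  # subtract the max for numerical stability
    return post / post.sum()

# Hypothetical data: 40 points around 2 followed by 20 points around 4.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(2.0, 0.5, 40), rng.normal(4.0, 0.5, 20)])
posterior = changepoint_posterior(z, sigma=0.5)
tau_map = int(np.argmax(posterior)) + 1  # most probable value of tau
print(tau_map, posterior.max())
```

Unlike the frequentist estimators, the result is a full distribution over τ, so the spread of `posterior` shows how uncertain the location of the change is.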

Slide 32

Slide 32 text

Bayesian Approach: PyMC3 - Example

    import numpy as np
    import pymc3 as pm

    ## z = array of observations (assumed to be defined earlier)
    ## set number of sample
    ## set t = time, from 0 to length of observations
    samples = 5000 ## number of iteration
    t = np.arange(0, len(z)) ## array of observation positions (time)

    with pm.Model() as model:
        ## define uniform priors for the mean values
        mu_a = pm.Uniform('mu_a', 0, 10)
        mu_b = pm.Uniform('mu_b', 0, 10)
        sigma = pm.HalfCauchy('sigma', np.std(z))
        tau = pm.DiscreteUniform('tau', t.min(), t.max())

        ## define stochastic variable mu:
        ## mu_a up to and including tau, mu_b afterwards
        mu = pm.math.switch(tau >= t, mu_a, mu_b)
        observation = pm.Normal('observation', mu, sigma, observed = z)

        trace = pm.sample(samples, step = pm.NUTS())

    ## discard the first 1000 draws as burn-in
    burned_trace = trace[1000:]

Slide 33

Slide 33 text

Bayesian Approach: PyMC3 - Changepoint distribution

Slide 34

Slide 34 text

Bayesian Approach: PyMC3 - Mean distribution

Slide 35

Slide 35 text

Bayesian Approach: Estimate the change point

Slide 36

Slide 36 text

3. Conclusions: What have we learned?

Slide 37

Slide 37 text

Thanks!
Materials: https://github.com/elvyna/pycon-id-2019
Find me on Twitter: @vexenta

Slide 38

Slide 38 text

Want to learn more?
- Killick, R. (2017). Introduction to optimal changepoint detection algorithms. useR! Tutorial 2017.
- Kass-Hout, T. (2010). Change point analysis. Slideshare.
- Bellei, C. (2016). Changepoint Detection. Part I - A Frequentist Approach. [Blog]
- Bellei, C. (2017). Changepoint Detection. Part II - A Bayesian Approach. [Blog]
- Davidson-Pilon, C. (2015). Chapter 1 - Introduction - PyMC3. Probabilistic Programming and Bayesian Methods for Hackers.