Slide 1

Slide 1 text

Probabilistic outlier detection and visualization of smart meter data
Rob J Hyndman
Cairns, June 2017

Slide 2

Slide 2 text

Outline
1. Irish smart meter data
2. Quantiles conditional on time of week
3. Finding typical and unusual households
4. Visualization via embedding
5. Spectral clustering and embedding
6. Features and limitations

Slide 3

Slide 3 text

Irish smart meter data
Figure: http://solutions.3m.com
- 500 households from smart metering trial
- Electricity consumption at 30-minute intervals between 14 July 2009 and 31 December 2010
- Heating/cooling energy usage excluded

Slide 4

Slide 4 text

Irish smart meter data
[Figure: Demand for ID: 1718. Axes: Days, Demand (kWh).]

Slide 5

Slide 5 text

Irish smart meter data
[Figure: Demand for ID: 1550. Axes: Days, Demand (kWh).]

Slide 6

Slide 6 text

Irish smart meter data
[Figure: Demand for ID: 1539. Axes: Days, Demand (kWh).]

Slide 8

Slide 8 text

Quantiles conditional on time of week
Compute sample quantiles at p = 0.01, 0.02, ..., 0.99 for each household and each half-hour of the week.
336 probability distributions per household (7 days × 48 half-hours).
[Figure: Demand for ID: 1718. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 9

Slide 9 text

Quantiles conditional on time of week
Compute sample quantiles at p = 0.01, 0.02, ..., 0.99 for each household and each half-hour of the week.
336 probability distributions per household (7 days × 48 half-hours).
[Figure: Demand for ID: 1550. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 10

Slide 10 text

Quantiles conditional on time of week
Compute sample quantiles at p = 0.01, 0.02, ..., 0.99 for each household and each half-hour of the week.
336 probability distributions per household (7 days × 48 half-hours).
[Figure: Demand for ID: 1539. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]
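The quantile computation described on these slides can be sketched as follows. This is a minimal illustration, not the author's code: the function name `quantile_surface`, the `start_slot` argument, and the simulated gamma-distributed demand are all assumptions made for the example. Missing readings are marked as NaN and simply dropped within each time-of-week slot.

```python
import numpy as np

def quantile_surface(demand, start_slot=0, periods_per_day=48):
    """Sample quantiles for each half-hour of the week.

    demand: 1-D array of half-hourly readings; NaN marks a missing value.
    start_slot: half-hour-of-week index of the first reading (assumed known).
    Returns an array of shape (7 * periods_per_day, 99); row t holds the
    quantiles at p = 0.01, ..., 0.99 for time-of-week slot t.
    """
    demand = np.asarray(demand, dtype=float)
    slots_per_week = 7 * periods_per_day
    probs = np.arange(1, 100) / 100.0
    # Time-of-week slot of every reading.
    slot = (start_slot + np.arange(demand.size)) % slots_per_week
    surface = np.full((slots_per_week, probs.size), np.nan)
    for t in range(slots_per_week):
        values = demand[slot == t]
        values = values[~np.isnan(values)]  # drop missing readings
        if values.size:
            surface[t] = np.quantile(values, probs)
    return surface

# Simulated household (hypothetical data): 535 days of skewed,
# non-negative demand with a short run of missing readings.
rng = np.random.default_rng(0)
demand = rng.gamma(shape=2.0, scale=0.5, size=535 * 48)
demand[100:150] = np.nan
surface = quantile_surface(demand)
print(surface.shape)
```

Because each slot is summarised separately, series of different lengths or with gaps still yield a surface of the same shape.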

Slide 19

Slide 19 text

Quantiles conditional on time of week
- Sample quantiles work better than a kernel density estimate given:
    - presence of zeros
    - non-negative support
    - high skewness
- Avoids missing data issues and variation in series length.
- Avoids dependence on the timing of household events, holidays, etc.
- Allows clustering of households based on probabilistic behaviour rather than coincident behaviour.
- Allows identification of anomalous households.
- Allows estimation of typical household behaviour.

Slide 21

Slide 21 text

Pairwise distances
The time series of 535 × 48 observations per household is mapped to a set of 7 × 48 × 99 quantiles, giving a bivariate surface for each household.
Can we compute pairwise distances between all households?
[Diagram: each time series is mapped to a quantile surface; the distance between two surfaces is marked "?".]

Slide 25

Slide 25 text

Jensen-Shannon distances
Kullback-Leibler divergence between two densities:
$D(p, q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx$
Not symmetric: $D(p, q) \ne D(q, p)$
Jensen-Shannon distance between two densities:
$JS(p, q) = [D(p, r) + D(q, r)]/2$ where $r = (p + q)/2$
Distance between two households:
$\Delta_{ij} = \sum_{t=1}^{7 \times 48} JS(p_t, q_t)$
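A discrete version of these definitions can be sketched as below. The slides do not say how the densities are obtained from the quantiles, so this example assumes each conditional distribution has already been turned into a probability vector over a shared grid of bins. The function names and the toy vectors are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p, q) = sum p * log(p / q)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js_distance(p, q):
    """JS(p, q) = [D(p, r) + D(q, r)] / 2 with r = (p + q) / 2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    r = 0.5 * (p + q)
    return 0.5 * (kl_divergence(p, r) + kl_divergence(q, r))

def household_distance(P, Q):
    """Sum of JS over time-of-week slots; P and Q have shape (slots, bins)."""
    return sum(js_distance(p_t, q_t) for p_t, q_t in zip(P, Q))

# Toy probability vectors over three bins (illustrative only).
p = [0.8, 0.1, 0.1]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q), kl_divergence(q, p))  # the two values differ
print(js_distance(p, q), js_distance(q, p))      # the two values agree

# Two "households" with two time-of-week slots each.
P = np.array([p, q])
Q = np.array([q, p])
print(household_distance(P, Q))
```

Mixing with r keeps every logarithm finite, since r is positive wherever p or q is positive.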

Slide 27

Slide 27 text

Kernel matrix and density ranking
Similarity between two households: $w_{ij} = \exp(-\Delta_{ij}^2 / h^2)$, where $h$ is the bandwidth of the Gaussian kernel.
Row sums of the kernel matrix give a scaled kernel density estimate of households: $\hat{f}_i = \sum_{j=1}^{n} w_{ij}$
Households can be ranked by density values.
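A small sketch of the ranking step, assuming a precomputed symmetric distance matrix. The 4 × 4 matrix and the bandwidth value below are made up for illustration, and the function names are hypothetical.

```python
import numpy as np

def kernel_matrix(delta, h):
    """Gaussian kernel similarities w_ij = exp(-delta_ij^2 / h^2)."""
    delta = np.asarray(delta, dtype=float)
    return np.exp(-(delta ** 2) / h ** 2)

def density_ranking(delta, h):
    """Row sums of the kernel matrix and households ordered by density."""
    W = kernel_matrix(delta, h)
    f_hat = W.sum(axis=1)        # scaled kernel density estimate
    order = np.argsort(-f_hat)   # highest density (most typical) first
    return f_hat, order

# Toy distances: households 0-2 are close together, household 3 is far away.
delta = np.array([
    [0.0, 0.1, 0.2, 2.0],
    [0.1, 0.0, 0.1, 2.0],
    [0.2, 0.1, 0.0, 2.0],
    [2.0, 2.0, 2.0, 0.0],
])
W = kernel_matrix(delta, h=0.5)
f_hat, order = density_ranking(delta, h=0.5)
print(order[0], order[-1])  # most typical and most anomalous household
```

Households at the top of the ordering are candidates for "typical"; those at the bottom are candidates for "anomalous".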

Slide 28

Slide 28 text

Typical households
[Figure: Demand for ID: 1672. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 29

Slide 29 text

Typical households
[Figure: Demand for ID: 1058. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 30

Slide 30 text

Typical households
[Figure: Demand for ID: 1183. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 31

Slide 31 text

Anomalous households
[Figure: Demand for ID: 1881. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 32

Slide 32 text

Anomalous households
[Figure: Demand for ID: 1607. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 33

Slide 33 text

Anomalous households
[Figure: Demand for ID: 1821. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 39

Slide 39 text

Laplacian eigenmaps
Idea: Embed conditional densities in a 2d space where the distances are preserved “as far as possible”.
- Let $W = [w_{ij}]$ where $w_{ij} = \exp(-\Delta_{ij}^2 / h^2)$.
- $D = \mathrm{diag}(\hat{f}_i)$ where $\hat{f}_i = \sum_{j=1}^{n} w_{ij}$.
- $L = D - W$ (the Laplacian matrix).
- Solve the generalized eigenvector problem: $Le = \lambda De$.
- Let $e_k$ be the eigenvector corresponding to the $k$th smallest eigenvalue.
- Then $e_2$ and $e_3$ create an embedding of households in 2d space.
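The steps above can be sketched with NumPy alone. As an implementation choice that the slides do not specify, the generalized problem Le = λDe is solved through the equivalent symmetric problem for D^{-1/2} L D^{-1/2}, whose eigenvectors u give e = D^{-1/2} u. The function name and the random toy distances are assumptions for the example.

```python
import numpy as np

def laplacian_eigenmap(W, n_components=2):
    """Embed points using generalized eigenvectors of L e = lambda D e.

    Returns all eigenvalues (ascending) and the eigenvectors belonging to
    the 2nd, 3rd, ... smallest eigenvalues as embedding coordinates.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                  # row sums, the diagonal of D
    L = np.diag(d) - W                 # Laplacian matrix
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form D^{-1/2} L D^{-1/2}; same eigenvalues as the
    # generalized problem.
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    E = d_inv_sqrt[:, None] * U         # back-transform: e = D^{-1/2} u
    return eigvals, E[:, 1:1 + n_components]  # skip the trivial first vector

# Toy similarity matrix from random points (illustrative only, h = 1).
rng = np.random.default_rng(1)
points = rng.normal(size=(20, 3))
delta = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
W = np.exp(-(delta ** 2) / 1.0 ** 2)
eigvals, Y = laplacian_eigenmap(W)
print(Y.shape)
```

The back-transformed eigenvectors satisfy e'De = 1, matching the normalisation used in the constrained minimisation form of the problem.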

Slide 43

Slide 43 text

Key property of Laplacian embedding
Let $y_i = (e_{2,i}, e_{3,i})$ be the embedded point corresponding to household $i$.
Then the Laplacian eigenmap minimizes $\sum_{ij} w_{ij} (y_i - y_j)^2 = y^\top L y$ subject to $y^\top D y = 1$.
- The most similar points are as close as possible.
- First eigenvalue is 0 due to translation invariance.
- Equivalent to optimal embedding using Laplace-Beltrami operator on manifolds.
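The link between the weighted sum of squared differences and the quadratic form in L can be checked by expanding the square for a single coordinate vector y, using the symmetry of W and the row sums $\hat{f}_i$. The derivation below assumes the sum runs over all ordered pairs (i, j); under that reading a constant factor of 2 appears, which does not change the minimizer.

```latex
\begin{aligned}
\sum_{i,j} w_{ij} (y_i - y_j)^2
  &= \sum_{i,j} w_{ij} y_i^2 + \sum_{i,j} w_{ij} y_j^2 - 2 \sum_{i,j} w_{ij} y_i y_j \\
  &= 2 \sum_{i} \hat{f}_i y_i^2 - 2\, y^\top W y \\
  &= 2\, y^\top (D - W)\, y \;=\; 2\, y^\top L y .
\end{aligned}
```

Setting all $y_i$ equal makes every term zero, which is why the smallest eigenvalue is 0 and the corresponding constant eigenvector is discarded.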

Slide 44

Slide 44 text

Outliers shown in embedded space
[Figure: Laplacian embedding (HDRs on original space). Scatterplot with axes Comp1 and Comp2; legend: HDRs (1, 50, 99, >99); households 1881, 1607 and 1821 are labelled.]

Slide 45

Slide 45 text

Outliers shown in embedded space
[Figure: Laplacian embedding (HDRs on original space). Scatterplot with axes Comp1 and Comp2; legend: HDRs (1, 50, 99, >99); ten points are labelled 1 to 10.]

Slide 46

Slide 46 text

Outliers computed in embedded space
[Figure: Laplacian embedding (HDRs on embedded space). Scatterplot with axes Comp1 and Comp2; legend: HDRs (1, 50, 99, >99); households 1881, 1607 and 1816 are labelled.]

Slide 49

Slide 49 text

Spectral clustering
- Let $W = [w_{ij}]$ where $w_{ij} = \exp(-\Delta_{ij}^2 / h^2)$.
- $D = \mathrm{diag}(\hat{f}_i)$ where $\hat{f}_i = \sum_{j=1}^{n} w_{ij}$.
- $A = D^{-1/2} W D^{-1/2}$ (the affinity matrix).
- Top $k$ eigenvectors of $A$ are clustered using k-means.
- The bandwidth $h$ need not be the same as that used for embedding or density ranking.
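A compact sketch of this procedure, using only NumPy. The k-means routine here is a plain Lloyd's iteration written for the example rather than any particular library's implementation, and the function names, the seed, and the two-group toy data are assumptions. The rows of the eigenvector matrix are clustered as they are, following the slide; some spectral clustering variants also rescale each row to unit length first.

```python
import numpy as np

def top_eigenvectors(W, k):
    """Top k eigenvectors of the affinity matrix A = D^{-1/2} W D^{-1/2}."""
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    A = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    return vecs[:, -k:]                # columns for the k largest eigenvalues

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centre.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: two well-separated groups of five points each (illustrative).
rng = np.random.default_rng(2)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(5, 2)),
    rng.normal(10.0, 0.3, size=(5, 2)),
])
delta = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
W = np.exp(-(delta ** 2) / 1.0 ** 2)   # bandwidth h = 1, chosen arbitrarily
labels = kmeans(top_eigenvectors(W, k=2), k=2)
print(labels)
```

In practice k-means is usually restarted from several initialisations; a single start is used here to keep the sketch short.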

Slide 50

Slide 50 text

Clustering shown in embedded space
[Figure: Laplacian embedding with spectral clustering. Scatterplot with axes Comp1 and Comp2; legend: Cluster (1, 2, 3).]

Slide 51

Slide 51 text

Clustering shown in embedded space
[Figure: Laplacian embedding with spectral clustering. Scatterplot with axes Comp1 and Comp2; legend: Cluster (1, 2, 3); households 1081, 1058 and 1810 are labelled.]

Slide 52

Slide 52 text

Typical household in cluster 1
[Figure: Demand for ID: 1081. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 53

Slide 53 text

Typical household in cluster 2
[Figure: Demand for ID: 1058. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 54

Slide 54 text

Typical household in cluster 3
[Figure: Demand for ID: 1810. Top panel: Demand (kWh) against time of day, one facet per day (Monday to Sunday). Bottom panel: Quantiles against time of day, one facet per day; legend: Probability (0.1 to 0.9).]

Slide 57

Slide 57 text

Features and limitations
Features of approach
- Converting time series to quantile surfaces conditional on time of week.
- Using pairwise distances between households.
- Using kernel matrices for density ranking, embedding and clustering.
Unresolved issues
- Need to select the bandwidth h in constructing the similarity matrix.
- Three different uses of bandwidth: density ranking, embedding, clustering. Different bandwidth in each case?
- The use of pairwise distances makes it hard to scale this algorithm.
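To make the scaling concern concrete, here is a small count of the work involved, using the figures from the slides (500 households, 7 × 48 time-of-week slots). The function name is hypothetical, and the count covers only the number of Jensen-Shannon evaluations, not the cost of each one.

```python
def pairwise_js_evaluations(n_households, slots_per_week=7 * 48):
    """Number of household pairs and of per-slot JS evaluations."""
    n_pairs = n_households * (n_households - 1) // 2
    return n_pairs, n_pairs * slots_per_week

n_pairs, n_js = pairwise_js_evaluations(500)
print(n_pairs, n_js)  # 124750 pairs, 41916000 JS evaluations

# Ten times as many households needs roughly a hundred times the work.
n_pairs_large, n_js_large = pairwise_js_evaluations(5000)
print(n_pairs_large, n_js_large)
```

The quadratic growth in the number of pairs is what makes the approach hard to apply to much larger collections of households.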