Slide 1

Slide 1 text

NILMTK An Open Source Toolkit for Non-intrusive Load Monitoring 1

Slide 2

Slide 2 text

NILMTK team Nipun Batra Amarjeet Singh Mani Srivastava Jack Kelly William Knottenbelt Haimonti Dutta Oliver Parson Alex Rogers 2

Slide 3

Slide 3 text

Non-intrusive load monitoring (Energy disaggregation) “Process of estimating the energy consumed by individual appliances given just a whole-house power meter reading” 3

Slide 4

Slide 4 text

Wait a minute! This sounds complicated. Would it help? 4

Slide 5

Slide 5 text

Jane goes to the market 5

Slide 6

Slide 6 text

Jane spends 200 pounds on her purchases 6

Slide 7

Slide 7 text

Jane’s husband John is worried about the expenses 7

Slide 8

Slide 8 text

He spends some time and looks at the purchase list 8

Slide 9

Slide 9 text

Do you think the itemized billing helped him? NILM is the same, but for energy! 9

Slide 10

Slide 10 text

Quiz time! Identify this famous CS scientist 10

Slide 11

Slide 11 text

Quiz time! Identify this famous CS scientist. That ain’t any great scientist. That’s me on my first birthday in 1990… This is not too far from the time when NILM was first discussed. 11

Slide 12

Slide 12 text

Giving credit where it is due 12

Slide 13

Slide 13 text

NILM interest explosion 1. National smart meter rollouts 2. Reduced hardware costs 3. International meetings – NILM workshop 2012, 2014; EPRI NILM 2013 4. Public datasets 5. Startups 13

Slide 14

Slide 14 text

“Data is the new oil” • 9 NILM datasets and counting (a few not specific to NILM) • Across 6 countries (India, UK, US, Canada, EU) • Measure aggregate and appliance-level data • Across 3 colors: REDD, BLUED, GREEND 14

Slide 15

Slide 15 text

The industry is interested! 15

Slide 16

Slide 16 text

So, is everything rosy? Not quite! Else we wouldn’t be here. 16

Slide 17

Slide 17 text

The scientific method “The scientific method is a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge” (as per Wikipedia) 17

Slide 18

Slide 18 text

3 core obstacles preventing comparison of state-of-the-art 18

Slide 19

Slide 19 text

1. Hard to assess generality • Subtle differences in the aims of different data sets • Previous contributions evaluated on only a single dataset. • Non-trivial to set up similar experimental conditions for direct comparison. 19

Slide 20

Slide 20 text

2. Lack of comparison against same benchmarks • Newly proposed algorithms are rarely compared against the same benchmarks. • Lack of “open source” reference algorithms often leads to reimplementation. 20

Slide 21

Slide 21 text

3. “Inconsistent” disaggregation performance metrics • Different performance metrics proposed in the past. • Different formulae for the same metric, e.g. 4+ versions of “energy assigned” 21

Slide 22

Slide 22 text

What is NILMTK? Open source NILM toolkit 22

Slide 23

Slide 23 text

What does it do? Enable easy comparative analysis of NILM algorithms across data sets. 23

Slide 24

Slide 24 text

How does it do that? Provides a pipeline from data sets to metrics to lower the entry barrier for researchers. 24

Slide 25

Slide 25 text

NILMTK pipeline [Diagram: data sets (REDD, BLUED, UK-DALE), data interface, NILMTK-DF, statistics, preprocessing, training, model, disaggregation, metrics] 25

Slide 26

Slide 26 text

Data Format [Pipeline diagram] 26

Slide 27

Slide 27 text

Data Format • We propose NILMTK-DF: a common data format. • We provide importers for 6 datasets: REDD, SMART*, Pecan Street, iAWE, AMPds, UK-DALE • Supports both flat-file and efficient binary storage formats 27

Slide 28

Slide 28 text

The fun of data! 28

Slide 29

Slide 29 text

Standardizing nomenclature Fridge Refrigerator FGE 29
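A minimal sketch of what standardizing nomenclature can look like in code. The three labels come from the slide; the alias table, the canonical name "fridge", and the `standardize` helper are assumptions made for illustration and are not NILMTK's actual naming scheme.

```python
# Hypothetical alias table: the labels come from the slide, the mapping is illustrative.
CANONICAL_NAMES = {
    "fridge": "fridge",
    "refrigerator": "fridge",
    "fge": "fridge",
}


def standardize(label: str) -> str:
    """Map a dataset-specific appliance label to one canonical name."""
    key = label.strip().lower()
    # Unknown labels pass through (lower-cased) so nothing is silently dropped.
    return CANONICAL_NAMES.get(key, key)


labels = ["Fridge", "Refrigerator", "FGE"]
standardized = [standardize(label) for label in labels]
print(standardized)
```

Once every dataset uses the same name for the same appliance, cross-dataset comparisons reduce to a simple lookup by that name.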

Slide 30

Slide 30 text

Metadata • Geographic coordinates • Type of appliance: hot, cold, dry? • Metering hierarchy • Parameters measured 30

Slide 31

Slide 31 text

Standard nomenclature + Metadata + Datasets = Comparing power draw of washing machines across US (REDD) and UK (UK-DALE) 31

Slide 32

Slide 32 text

Standard nomenclature + Metadata + Datasets = Top 5 appliances by energy consumption across geographies [Chart panels: US, UK, India] 32

Slide 33

Slide 33 text

NILMTK pipeline [Pipeline diagram] 33

Slide 34

Slide 34 text

Statistics [Bar chart: % energy submetered (0 to 100) for REDD, Smart*, Pecan, AMPds, iAWE, UK-DALE] • Energy submetered: sum of energy of all appliances / energy at mains level • More energy submetered means more ground truth 34
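The submetered-energy ratio can be sketched in a few lines. This is an illustration in plain Python, not NILMTK's own function: the helper names, the equally spaced samples, and the example readings are all assumptions.

```python
def energy_kwh(power_watts, sample_period_s):
    """Energy in kWh from equally spaced power samples (rectangle rule)."""
    return sum(power_watts) * sample_period_s / 3.6e6


def proportion_submetered(appliance_power, mains_power, sample_period_s):
    """Sum of appliance energy divided by energy measured at the mains."""
    total = energy_kwh(mains_power, sample_period_s)
    if total == 0:
        raise ValueError("mains energy is zero; the ratio is undefined")
    submetered = sum(
        energy_kwh(series, sample_period_s) for series in appliance_power.values()
    )
    return submetered / total


# Made-up readings in watts, one sample per minute.
mains = [300.0, 2300.0, 2300.0, 300.0]
appliances = {
    "fridge": [200.0, 200.0, 200.0, 200.0],
    "air conditioner": [0.0, 2000.0, 2000.0, 0.0],
}
fraction = proportion_submetered(appliances, mains, sample_period_s=60)
print(f"{100 * fraction:.1f}% of mains energy is submetered")
```

The remaining share of mains energy belongs to loads without their own meter, which is why a higher ratio gives more ground truth to evaluate against.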

Slide 35

Slide 35 text

Statistics • Appliance usage patterns • Correlations with weather • Appliance power demands 35

Slide 36

Slide 36 text

Diagnostics • Every data set has problems, so NILMTK provides diagnostic functions for common problems. • % lost samples (per interval and whole), uptime [Figure: % lost samples in house 1 of the REDD dataset] 36
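One way to compute a lost-samples percentage, as a rough sketch. It assumes sorted timestamps in seconds and a known nominal sample period; the function name is hypothetical and this is not NILMTK's diagnostic API. It only detects gaps between the first and last recorded sample.

```python
def percent_lost_samples(timestamps, sample_period_s):
    """Percentage of expected samples missing between first and last timestamp."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    span = timestamps[-1] - timestamps[0]
    # Number of samples a gap-free recording would contain over the same span.
    expected = int(round(span / sample_period_s)) + 1
    lost = max(expected - len(timestamps), 0)
    return 100.0 * lost / expected


# Made-up timestamps (seconds) from a 1 Hz meter with two gaps.
timestamps = [0, 1, 2, 5, 6, 9]
lost_pct = percent_lost_samples(timestamps, sample_period_s=1)
print(f"{lost_pct:.0f}% of samples lost")
```

Applying the same function to each hour or day of data separately would give the per-interval view mentioned on the slide.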

Slide 37

Slide 37 text

Preprocessing [Pipeline diagram] 37

Slide 38

Slide 38 text

Preprocessing • Correct common problems (as per diagnosis). • Other standard NILM preprocessors: – Interpolating, filtering implausible values – Downsampling to a lower frequency – Selecting the top-k appliances by energy consumption 38
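The preprocessors listed above can be sketched in plain Python. The function names, thresholds, and readings below are illustrative assumptions, not NILMTK's preprocessing API; samples are assumed equally spaced, so summed power is proportional to energy.

```python
def filter_implausible(power_watts, max_watts):
    """Drop readings that are negative or above a plausible maximum."""
    return [p for p in power_watts if 0 <= p <= max_watts]


def downsample(power_watts, factor):
    """Average consecutive blocks of `factor` samples (last block may be shorter)."""
    blocks = [power_watts[i:i + factor] for i in range(0, len(power_watts), factor)]
    return [sum(block) / len(block) for block in blocks]


def top_k_appliances(appliance_power, k):
    """Names of the k appliances with the highest total consumption."""
    totals = {name: sum(series) for name, series in appliance_power.items()}
    return sorted(totals, key=totals.get, reverse=True)[:k]


cleaned = filter_implausible([100.0, -5.0, 200.0, 99999.0], max_watts=4000.0)
coarse = downsample([100.0, 200.0, 300.0, 400.0, 500.0], factor=2)
top_two = top_k_appliances(
    {
        "fridge": [200.0, 200.0, 200.0, 200.0],
        "air conditioner": [0.0, 2000.0, 2000.0, 0.0],
        "laptop": [30.0, 30.0, 30.0, 30.0],
    },
    k=2,
)
print(cleaned, coarse, top_two)
```

Dropping implausible readings changes the sample spacing, so a real pipeline would typically interpolate or resample afterwards.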

Slide 39

Slide 39 text

Heart of NILMTK [Pipeline diagram] 39

Slide 40

Slide 40 text

Training • NILMTK provides two benchmark algorithms –Combinatorial optimization (CO) [Proposed by Hart] –Factorial hidden Markov model (FHMM) [More recent, more complex] 40

Slide 41

Slide 41 text

Model • Beyond the usual train and disaggregate, NILMTK allows importing and exporting learnt models • Allows NILM to be deployed in “real world settings” • Actions speak louder than words! Demo follows! 41

Slide 42

Slide 42 text

Disaggregate! • Quite a bit of work before we disaggregate • We performed – CO and FHMM based disaggregation across the first home of each dataset – Detailed disaggregation analysis across the home in iAWE (dataset from India) 42

Slide 43

Slide 43 text

Disaggregation across multiple datasets • CO as good as FHMM across iAWE, UKPD, Pecan datasets – Space heating contributes 60% in Pecan and 35% in iAWE. Both approaches are able to detect it with fair ease. And I thought that CO was really outdated… 43

Slide 44

Slide 44 text

Disaggregation across multiple datasets • FHMM outperforms CO across REDD, Smart*, AMPds • This is expected as FHMM models time variations. • CO exponentially quicker than FHMM 44

Slide 45

Slide 45 text

Detailed disaggregation in iAWE dataset (India) • CO and FHMM perform similarly • Appliances such as air conditioners are much easier to disaggregate • Complex appliances (laptops and washing machines): not so good 45

Slide 46

Slide 46 text

NILMTK pipeline [Pipeline diagram] 46

Slide 47

Slide 47 text

Metrics • NILMTK provides: – General machine learning metrics • Precision, Recall, F-score – Specialized metrics for NILM • Error in total energy assigned, RMS error in assigned power, ... – Both event-based and total-power-based NILM metrics. 47
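A rough sketch of the metrics named above for a single appliance. Since several competing formulae exist for "energy assigned", the definitions here are one plausible choice and not necessarily the ones NILMTK implements; the function names, the 10 W on/off threshold, and the readings are assumptions.

```python
import math


def precision_recall_f1(true_on, predicted_on):
    """Classification metrics on per-sample on/off states."""
    pairs = list(zip(true_on, predicted_on))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def total_energy_error_wh(true_watts, estimated_watts, sample_period_s):
    """Absolute difference between estimated and true energy, in Wh."""
    difference = sum(estimated_watts) - sum(true_watts)
    return abs(difference) * sample_period_s / 3600.0


def rms_error_watts(true_watts, estimated_watts):
    """Root-mean-square error between estimated and true power."""
    squared = [(e - t) ** 2 for t, e in zip(true_watts, estimated_watts)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up ground truth and disaggregated estimate, one sample per minute.
true_watts = [0.0, 200.0, 200.0, 0.0]
estimated_watts = [0.0, 180.0, 220.0, 40.0]
true_on = [w > 10 for w in true_watts]
predicted_on = [w > 10 for w in estimated_watts]

precision, recall, f1 = precision_recall_f1(true_on, predicted_on)
energy_error = total_energy_error_wh(true_watts, estimated_watts, sample_period_s=60)
rmse = rms_error_watts(true_watts, estimated_watts)
print(precision, recall, f1, energy_error, rmse)
```

Note that the energy error can be small even when the per-sample RMS error is large, because over- and under-estimates cancel in the sum; this is one reason reporting several metrics is useful.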

Slide 48

Slide 48 text

Demo time!! 48

Slide 49

Slide 49 text

Conclusions Three core challenges in NILM research: 1. Hard to assess generality 2. Lack of comparison against same benchmarks 3. Inconsistent disaggregation performance metrics How NILMTK addresses these challenges: 1. Standard input and output formats (Addresses #1) 2. Parsers for 6 NILM data sets (Addresses #1, #2) 3. Two benchmark NILM algorithms (Addresses #1, #2) 4. Statistics, diagnostics and preprocessing (Addresses #1, #2) 5. Metrics for different NILM use cases (Addresses #3) 49

Slide 50

Slide 50 text

Backup 50

Slide 51

Slide 51 text

Combinatorial optimization • Seeks the combination of appliances’ power draws that minimizes residual energy. • Similar to the subset-sum problem and thus NP-complete • Each sample is treated independently: power draw is not related in time 51

Slide 52

Slide 52 text

Combinatorial optimization
Appliance             Off power   On power
Air conditioner (AC)  0           2000
Refrigerator          0           200
If total power observed = 210, then AC is OFF and Refrigerator is ON 52

Slide 53

Slide 53 text

Combinatorial optimization
Appliance             Off power   On power
Air conditioner (AC)  0           2000
Refrigerator          0           200
If total power observed = 2000, then AC is ON and Refrigerator is OFF 53

Slide 54

Slide 54 text

Combinatorial optimization
Appliance             Off power   On power
Air conditioner (AC)  0           2000
Refrigerator          0           200
If total power observed = 2230, then AC is ON and Refrigerator is ON 54
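The three worked examples can be reproduced with a brute-force search over every combination of appliance states. This is a minimal sketch of the idea, not NILMTK's CO implementation; the function name and data structure are assumptions, and the power levels are the ones from the table.

```python
from itertools import product

# Off and on power per appliance, taken from the table above.
APPLIANCE_STATES = {
    "air conditioner": (0, 2000),
    "refrigerator": (0, 200),
}


def disaggregate_co(total_power, appliance_states=APPLIANCE_STATES):
    """Pick the state combination whose summed power is closest to the total."""
    names = list(appliance_states)
    candidates = product(*(appliance_states[name] for name in names))
    # Minimize the residual between the observed total and the combination's sum.
    best = min(candidates, key=lambda combo: abs(total_power - sum(combo)))
    return dict(zip(names, best))


for observed in (210, 2000, 2230):
    print(observed, disaggregate_co(observed))
```

The search space is the product of the number of states per appliance, so exhaustive enumeration like this grows exponentially with the number of appliances.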

Slide 55

Slide 55 text

FHMM • Each appliance modeled as an HMM – Power draw is related in time: if the TV is on right now, it is likely to be on the next second. • Exact inference scales worse than CO 55
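To illustrate "power draw related in time", here is a sketch that decodes a single two-state appliance HMM with the Viterbi algorithm and compares it with a memoryless nearest-level guess. It is not a full FHMM, which would run one such chain per appliance and observe only their sum. The power levels, noise width, stay probability, and readings are all made-up assumptions.

```python
import math

STATES = ("off", "on")
MEAN_POWER = {"off": 0.0, "on": 100.0}  # hypothetical TV power levels in watts
SIGMA = 30.0                            # assumed measurement noise (std dev)
STAY = 0.95                             # assumed probability of keeping the same state


def log_emission(state, watts):
    # Gaussian log-likelihood, omitting the constant shared by both states.
    return -((watts - MEAN_POWER[state]) ** 2) / (2 * SIGMA ** 2)


def log_transition(prev, cur):
    return math.log(STAY if prev == cur else 1.0 - STAY)


def viterbi(observations):
    """Most likely state sequence for a non-empty list of power readings."""
    score = {
        s: math.log(1.0 / len(STATES)) + log_emission(s, observations[0])
        for s in STATES
    }
    backpointers = []
    for watts in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (
                score[prev] + log_transition(prev, cur) + log_emission(cur, watts)
            )
            pointers[cur] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def nearest_state(observations):
    """Memoryless baseline: pick the state whose power level is closest."""
    return [min(STATES, key=lambda s: abs(w - MEAN_POWER[s])) for w in observations]


readings = [2.0, 95.0, 40.0, 105.0, 3.0]
decoded = viterbi(readings)
memoryless = nearest_state(readings)
print(decoded)
print(memoryless)
```

The dip to 40 W in the middle is closer to the off level, so the memoryless baseline flips the TV off and on again, while the HMM keeps it on because two quick state changes are unlikely under the assumed transition probabilities.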

Slide 56

Slide 56 text

A bit of history Seminal work on NILM done at MIT dates back to the early 1980s – a good 6-7 years before I was born! 56

Slide 57

Slide 57 text

Field progress [Chart: number of papers citing the seminal work per year, 1992 to 2004] What happened here? 57