A Deep Learning use case for water end use detection by Roberto Díaz and José Antonio Sánchez at Big Data Spain 2017

A DEEP LEARNING USE CASE FOR WATER END USE DETECTION

Motivation

Motivation Urban water supply
• We need good demand management policies to achieve sustainable development.
• Adding a new water source implies:
  o Higher costs.
  o Environmental damage.
  o Poorer quality.
• “the largest, least expensive, and most environmentally sound source of water […] is the water currently being wasted in every sector of our economy”. [1]

[1] Gleick, P. et al. (2003). Waste Not, Want Not: The Potential for Urban Water Conservation

Motivation End uses of water
• Residential use accounts for 70% of total water consumption.
• A good understanding of demand and its characterization can be very useful for creating good management policies.
• Several problems can be addressed using AI techniques:
  o End-use classification (dishwasher, toilet, irrigation, taps).
  o Water demand forecasting.

Motivation The problem
• Installing a meter on each water device is very expensive and intrusive.
• To overcome this problem, a single precision meter can be installed at the home's main water connection.
• Predictive models can read these meters and make predictions:
  o End use: classification problem.
  o Forecasting: regression problem.

Motivation Data Source
• Since 2008, Canal de Isabel II has monitored a sample of 300 homes spread over the region of Madrid.
• 15 million hours monitored over 9 years.
• 35 million events.
• The sample is stratified and spread across different geographical areas of the region so that it can be considered representative of the domestic users of Madrid.
• The goal is to study consumption patterns and end uses of urban water.

Motivation Project information
PROJECT TITLE: Pattern Recognition in Residential End Uses of Water
RESEARCH LINE: Assurance of the balance (availability / demand)
CLIENT: Canal de Isabel II
CONSORTIUM: Exeleria (preprocessing tasks), Treelogic (machine learning tasks)
GOAL: Developing an automatic system for identifying the end uses of water in domestic applications from the signals registered by water meters, using advanced machine learning techniques such as artificial neural networks (ANN) or other statistical methods

Starting Point

Starting Point Hardware Infrastructure
[Figure: water meter and datalogger]

Starting Point
• Data was labeled by operators (experts) who classify water use events using specialized software.
• This task involves a considerable number of man-hours.
• An operator needs 1 hour to analyse a two-week period of data from each installation.

Starting Point 8 types of events
• SHOWERS (INCLUDING BATHTUBS)
• DISHWASHER
• WASHING MACHINE
• CISTERNS
• LEAKS
• FAUCETS
• POOL
• IRRIGATION

Previous analysis and visualization

Previous analysis and visualization Pulse to Flow 1

BASELINE INFORMATION
• Date
• Number of pulses

DATE                  COUNTER (number of accumulated pulses)
01/06/2008 0:47:35    31542
01/06/2008 0:48:13    31543
01/06/2008 0:48:55    31544
01/06/2008 0:49:38    31545
01/06/2008 1:20:29    31546
01/06/2008 1:20:46    31547
01/06/2008 1:21:03    31548
01/06/2008 1:21:20    31549
…                     …

Previous analysis and visualization Pulse to Flow 1
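
The slides do not spell out the pulse-to-flow conversion, so the following is only a minimal sketch of one plausible approach: divide the volume represented by each new pulse by the time elapsed since the previous reading. The function name `pulses_to_flow`, the 1 litre per pulse resolution, the day/month/year date order and the `max_gap_s` cut-off (gaps longer than this are treated as zero flow) are all assumptions made for illustration.

```python
from datetime import datetime

# Readings copied from the slide: (timestamp, accumulated pulse count).
READINGS = [
    ("01/06/2008 0:47:35", 31542),
    ("01/06/2008 0:48:13", 31543),
    ("01/06/2008 0:48:55", 31544),
    ("01/06/2008 0:49:38", 31545),
    ("01/06/2008 1:20:29", 31546),
    ("01/06/2008 1:20:46", 31547),
    ("01/06/2008 1:21:03", 31548),
    ("01/06/2008 1:21:20", 31549),
]


def pulses_to_flow(readings, litres_per_pulse=1.0, max_gap_s=300.0):
    """Return the average flow (litres/minute) between consecutive readings.

    litres_per_pulse and max_gap_s are illustrative assumptions: a gap
    longer than max_gap_s is treated as a period with no consumption.
    """
    fmt = "%d/%m/%Y %H:%M:%S"  # assumed day/month/year order
    parsed = [(datetime.strptime(ts, fmt), count) for ts, count in readings]
    flows = []
    for (t0, c0), (t1, c1) in zip(parsed, parsed[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds > max_gap_s:
            flows.append(0.0)  # long silence between pulses: assume no flow
        else:
            flows.append((c1 - c0) * litres_per_pulse * 60.0 / seconds)
    return flows


flows = pulses_to_flow(READINGS)
```

With these readings, the first interval (38 seconds for one pulse) gives roughly 1.6 litres per minute, while the half-hour gap before 1:20:29 is reported as zero flow under the assumed cut-off.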

Previous analysis and visualization Episodes
• An episode is a period of time in which the flow is non-zero, bounded by two zero-flow instants.
• An episode may consist of one or more events.
• An event belongs to only one episode.
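
The episode definition can be sketched as a scan for maximal runs of non-zero flow. This is a minimal illustration, assuming the flow is already available as a regularly sampled list; the name `find_episodes` and the sample values are hypothetical.

```python
def find_episodes(flow):
    """Return (start, end) index pairs of maximal non-zero runs.

    `end` is exclusive, so flow[start:end] is the episode.
    """
    episodes = []
    start = None
    for i, q in enumerate(flow):
        if q > 0:
            if start is None:
                start = i  # flow has just become non-zero
        elif start is not None:
            episodes.append((start, i))  # flow has returned to zero
            start = None
    if start is not None:
        # The series ended while an episode was still open.
        episodes.append((start, len(flow)))
    return episodes


flow = [0, 0, 2, 2, 0, 3, 5, 3, 0, 0, 1, 0]  # made-up samples
episodes = find_episodes(flow)
```

Here `episodes` holds three index pairs, one per stretch of consumption separated by zero-flow instants.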

Previous analysis and visualization Events
• An event is an elementary unit of consumption that lasts long enough for its instantaneous flow to be clearly differentiated from the rest.
• A particular domestic use may consist of one or more events.
• One or several events that overlap in time form an episode.

3 domestic uses which involve 4 events
[Figure: flow Q against time T. Faucets: 1 event. Cisterns: 1 event. Washing machine: 2 events (cycle 1 and cycle 2).]

3 domestic uses which involve 4 events and 3 episodes
[Figure: flow Q against time T, divided into episodes 1, 2 and 3. Faucets: 1 event. Cisterns: 1 event. Washing machine: cycle 1 and cycle 2.]

Previous analysis and visualization Events identification
• When an episode consists of more than one event, the events overlap.
• Graphically, the events are "stacked" on one another like a ladder.
• How do we discriminate events?
  o It is the same event if…
    ⁻ The flow rate stays constant or the change is not significant.
  o It is a different event if…
    ⁻ There is a significant change in the flow rate.
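
The rule above can be illustrated by marking the samples where the flow rate jumps by more than a threshold. The slides do not say how "significant" is quantified, so the fixed `threshold`, the sample-to-sample comparison and the name `flow_steps` are assumptions of this sketch, not the project's actual criterion.

```python
def flow_steps(episode, threshold):
    """Return (index, change) pairs where the flow jumps by more than threshold.

    A positive change suggests a new event stacking on top of the current
    flow; a negative change suggests that an event has ended.
    """
    steps = []
    for i in range(1, len(episode)):
        change = episode[i] - episode[i - 1]
        if abs(change) > threshold:
            steps.append((i, change))
    return steps


# Made-up episode: a steady ~3 l/min use with a ~5 l/min use stacked on top.
episode = [3.0, 3.1, 3.0, 8.0, 8.1, 8.0, 3.0, 3.0]
steps = flow_steps(episode, threshold=1.0)
```

The small wobbles around 3 and 8 stay below the threshold, so only the rise at index 3 and the drop at index 6 are reported as event boundaries.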

Approach

Approach Feature Extraction 2
37 features were extracted from every event: duration, volume, maximum flow, initial gradient, …
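
As an illustration of the four features named on the slide, here is a minimal sketch. The exact definitions used in the project are not given, so the formulas below (in particular the initial gradient as the slope between the first two samples), the fixed sampling interval `dt_s` and the name `extract_features` are assumptions; the remaining features are not listed in the slides and are not guessed at.

```python
def extract_features(flow, dt_s):
    """Compute a few per-event features.

    flow: samples in litres/minute taken every dt_s seconds (assumed layout).
    """
    duration_s = len(flow) * dt_s
    volume_l = sum(flow) * dt_s / 60.0  # litres/minute * minutes
    max_flow = max(flow)
    # Assumed definition: slope between the first two samples.
    initial_gradient = (flow[1] - flow[0]) / dt_s if len(flow) > 1 else 0.0
    return {
        "duration_s": duration_s,
        "volume_l": volume_l,
        "max_flow_lpm": max_flow,
        "initial_gradient": initial_gradient,
    }


features = extract_features([2.0, 6.0, 6.0, 6.0, 2.0], dt_s=10.0)
```

Each event then becomes one fixed-length feature vector, which is the form a classifier such as a neural network or an SVM expects.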

Approach Deep Neural Networks
• Deep Learning (DL) is a major breakthrough in artificial intelligence with high potential for predictive applications.
• It has been recognized as one of ten breakthrough technologies by MIT Technology Review.
• DL has gone from being considered an academic field to being applied in engineering thanks to frameworks like TensorFlow or CNTK.
• Deep neural networks are very powerful and can solve very complex tasks.
• They require a large amount of data.
• Training times are long, and complex tasks require specialized hardware.
• They are slow classifiers.

Approach Deep Neural Networks

Approach Speedup (SDAs)
• A disadvantage of the backpropagation algorithm is that training is fast in the last layers (near the output) but very slow in layers far from the output.
• If we don’t have enough training data to run a high number of backpropagation iterations, we only train the layers near the output.
• If we can initialize the first layers of the neural network with useful weights, the training procedure will speed up.
• If that initialization is unsupervised, we can use unlabeled data.

Approach Speedup (SDAs)
• Imagine a neural network that has one hidden layer and the same number of neurons in the input as in the output.
• We add noise to the input and train the network to recover the original input.
• The network will learn to generalize because it receives different data with the same output.
• The network will learn to identify useful features of the image.
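
The denoising idea can be sketched with a tiny autoencoder written in plain Python. This is an illustrative toy, not the network used in the project: the sigmoid hidden layer, linear output, Gaussian noise, learning rate, layer sizes and the four binary training vectors are all assumptions chosen to keep the example small.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Small random weights for the encoder (W1, b1) and decoder (W2, b2)."""
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    return W1, [0.0] * n_hidden, W2, [0.0] * n_in


def forward(x, params):
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y


def train_step(x, params, rng, lr, noise):
    """One SGD step: corrupt the input, reconstruct, compare with the CLEAN input."""
    W1, b1, W2, b2 = params
    noisy = [v + rng.gauss(0.0, noise) for v in x]
    h, y = forward(noisy, params)
    dy = [yi - xi for yi, xi in zip(y, x)]  # gradient of 0.5 * squared error
    dh = [sum(dy[k] * W2[k][j] for k in range(len(x))) * h[j] * (1.0 - h[j])
          for j in range(len(h))]
    for k in range(len(x)):
        for j in range(len(h)):
            W2[k][j] -= lr * dy[k] * h[j]
        b2[k] -= lr * dy[k]
    for j in range(len(h)):
        for i in range(len(x)):
            W1[j][i] -= lr * dh[j] * noisy[i]
        b1[j] -= lr * dh[j]


def clean_mse(data, params):
    """Mean squared reconstruction error on uncorrupted inputs."""
    total = 0.0
    for x in data:
        _, y = forward(x, params)
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / len(x)
    return total / len(data)


rng = random.Random(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]  # toy inputs
params = init_params(n_in=4, n_hidden=2, rng=rng)
loss_before = clean_mse(data, params)
for _ in range(500):
    for x in data:
        train_step(x, params, rng, lr=0.1, noise=0.1)
loss_after = clean_mse(data, params)
```

The key detail is in `train_step`: the network sees the noisy vector but the error is measured against the clean one, so several different corrupted inputs share the same target.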

Approach Speedup (SDAs)
• How can I initialize an MLP using autoencoders?
• By stacking them.
• We can remove the decoding layer and attach another autoencoder at the output.
• A single autoencoder can only find basic useful weights.
• The idea of autoencoders in Deep Learning is to train several of them sequentially, using the hidden layer of one as the input of the next.
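
The stacking procedure described above is often called greedy layer-wise pretraining. The sketch below shows the loop under simplifying assumptions: each autoencoder is a small denoising one with a sigmoid encoder and linear decoder, and the names `train_dae` and `pretrain_stack`, the layer sizes and the toy data are hypothetical. The supervised fine-tuning of the resulting MLP is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, W, b):
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bj) for row, bj in zip(W, b)]


def train_dae(data, n_hidden, rng, epochs=100, lr=0.1, noise=0.1):
    """Train one denoising autoencoder and return only its encoder (W, b)."""
    n_in = len(data[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    c = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            noisy = [v + rng.gauss(0.0, noise) for v in x]
            h = encode(noisy, W, b)
            y = [sum(w * v for w, v in zip(row, h)) + ck for row, ck in zip(V, c)]
            dy = [yi - xi for yi, xi in zip(y, x)]  # error against clean input
            dh = [sum(dy[k] * V[k][j] for k in range(n_in)) * h[j] * (1.0 - h[j])
                  for j in range(n_hidden)]
            for k in range(n_in):
                for j in range(n_hidden):
                    V[k][j] -= lr * dy[k] * h[j]
                c[k] -= lr * dy[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    W[j][i] -= lr * dh[j] * noisy[i]
                b[j] -= lr * dh[j]
    return W, b  # the decoding layer (V, c) is discarded


def pretrain_stack(data, hidden_sizes, rng):
    """Train autoencoders one after another; each one is fed the hidden
    representation produced by the previous encoder."""
    layers = []
    current = data
    for n_hidden in hidden_sizes:
        W, b = train_dae(current, n_hidden, rng)
        layers.append((W, b))
        current = [encode(x, W, b) for x in current]
    return layers


rng = random.Random(0)
data = [  # toy unlabeled inputs
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0],
]
layers = pretrain_stack(data, hidden_sizes=[4, 2], rng=rng)
```

The returned `layers` would serve as the initial weights of the first layers of an MLP, with an output layer added on top before training on labeled data.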

Speedup (SDAs)

Results Benchmark

ACCURACY                 1 l meters    0.1 l meters
Deep Neural Networks     81.78%        91.19%
SVMs                     67.41%        84.78%

Results Accuracy comparison of Deep Neural Networks and SVMs for every water use

What else can Deep Learning do for water supply companies…?

What else…? Time Series
• Water supply companies are also interested in:
  o Water demand forecasting.
  o Weather or quantitative precipitation forecasting:
    ⁻ Volume of water in reservoirs.
    ⁻ Alert systems.
  o Time series forecasting.

What else…? RNN 3
• Traditional NNs assume that inputs are independent of each other.
• RNNs incorporate a memory that captures the essence of what has happened previously.
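
The "memory" of an RNN can be illustrated with a one-neuron simple recurrence, h_t = tanh(w_x * x_t + w_h * h_{t-1} + b). The weights below are arbitrary values chosen for illustration, and `run_rnn` is a hypothetical name.

```python
import math


def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Scalar simple RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    h = 0.0  # initial hidden state
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
    return h


# Both sequences end with the same input (0.0) but have different histories.
after_quiet = run_rnn([0.0, 0.0])
after_burst = run_rnn([1.0, 0.0])
```

Although the last input is identical in both cases, the final state differs because the hidden state still carries a trace of the earlier input. A feed-forward network looking only at the last input could not tell the two apart.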

What else…? LSTM 3
• A variant of RNN capable of learning long-term dependencies.
• Its internal architecture is more complex than that of a simple RNN.
• The most widely used type of RNN.

Southwestern precipitation forecast
Monthly precipitation (mm.) Southwestern mountain region (1932-1966) [**]
DATA SET: 420 rows, 2 columns (number of month, precipitation)

[**] https://datamarket.com/data/set/22ls/monthly-precipitation-mm-southwestern-mountain-region-1932-1966

Solution
• LSTM network
  o Input – 20 timesteps, 1 feature
  o Hidden Layer 1 – 20 LSTM
  o Output – 1 neuron
• MSE -> 16,944
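
An input of 20 timesteps and 1 feature implies that the monthly series is cut into sliding windows before being fed to the network. The sketch below shows one common way to do that, together with the MSE metric. It assumes each window of 20 months is used to predict the following month, which the slides do not state explicitly; `make_windows` and `mse` are hypothetical helper names, and the series is a synthetic placeholder rather than the real precipitation data. The LSTM model itself is not shown.

```python
def make_windows(series, timesteps=20):
    """Turn a 1-D series into (window, next value) training pairs."""
    X, y = [], []
    for i in range(len(series) - timesteps):
        X.append(series[i:i + timesteps])  # 20 consecutive months
        y.append(series[i + timesteps])    # the month to predict
    return X, y


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder with the same length as the data set (420 rows), NOT real data.
series = [float(i % 12) for i in range(420)]
X, y = make_windows(series)
```

With 420 rows and 20 timesteps this yields 400 training pairs, each window holding a single feature per timestep.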

Monthly precipitation (mm.) · Prediction (last 63 months)

CONCLUSIONS
01 Data science can help us to UNDERSTAND water demand and its characterization.
02 Deep Learning models can achieve very good results in terms of ACCURACY when trained on large enough datasets.
03 This METHODOLOGY is currently in use for processing data from the Panel for residential consumption patterns assessment and end-uses monitoring project of Canal de Isabel II in Madrid.
04 It could be very USEFUL for creating good management policies.

THANKS!
Roberto Díaz, LEADER OF THE DATA SCIENCE RESEARCH
José Antonio Sánchez, SENIOR R&D ENGINEER

Contact
Parque Tecnológico de Asturias, Parcela 30, E33428 Llanera, Asturias, ESPAÑA
Avda. Manoteras, 38, Oficina D614, E28050 Madrid, ESPAÑA
T +34 902 286 386
[email protected]
www.Treelogic.com
