Tutorial @ IEEE ICC 2019 : Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs

Machine Learning and Stochastic Geometry: Statistical Frameworks Against Uncertainty in Wireless LANs
Koji Yamamoto and Takayuki Nishio, Graduate School of Informatics, Kyoto University
2019-05-24, TUT20 part2: Deep Learning in WLANs

Talk Outline
0. What can we do with Deep Learning? Case study in mmWave WLANs
1. What is ML?
• Motivation
• Basics of ML
2. ML for Wireless LANs
• Case study: mmWave received power prediction from images
3. ML in Wireless LANs
• Transfer Learning and application example
• Federated Learning

mmWave WLANs
IEEE 802.11ad/ay
• Beyond-Gbit/s communications using a bandwidth of 2.16 GHz or more.
• Strong attenuation (15 dB or more) when a human blocks the line of sight (LOS).
[Figure: experimental result using IEEE 802.11ad devices with a mmWave AP. Throughput of path 1 (Mbit/s) versus time (s), annotated "Path 1 was blocked" at two points.]

mmWave received power prediction based on ML and camera [Okamoto18] [Nishio19+]
• Human blockage of the LOS path causes strong signal attenuation, so we need to predict the timing and strength of the attenuation.
• Key idea: ML and camera. Blockage is predicted from vision information.
• A deep neural network predicts the future received power between transmitter and receiver. The received power 0.5 s ahead is accurately predicted.

Vision and Deep Learning based Received Power Prediction [Okamoto18] [Nishio18]
Computer vision → Deep neural network → Received power prediction
[Figure: red line: received power predicted 0.5 s before the time; blue line: measured received power.]

What is machine learning (ML)?
A mathematical tool for obtaining a mapping from the input information we have to the output information we want.
Example in wireless:
• Input: RSSI, CSI, BER, throughput, packet loss, BSSID, location
• Mapping: estimated from data
• Output: location, future received power, future throughput, future QoS, available spectral band
Deep learning is one ML framework: a deep stack of multiple mapping functions.

When do we need it?
To handle complex and uncertain issues in wireless systems that are difficult to model analytically.
ML is a good idea when
• we have no (accurate) model, but intuitively there is large room for improvement;
• the system model is far from the ideal case (not Gaussian or stationary, many black boxes), but we have lots of data.

Machine Learning in/for Wireless LANs
In this talk, I try to answer the following two questions:
• How do we leverage ML for wireless?
• How do we train ML models in wireless?

Flow of ML
1. Define the problem as an ML task
2. Prepare the dataset
3. Select and tune the ML algorithm
4. Do training and validation
→ If the result is good: done!
→ If not, go back to step 2 or 3

To carry out the steps of this flow, we need to know the basics of ML, and we need know-how.

Basics of ML

Category of ML tasks
Unsupervised learning, which extracts knowledge from data
• Applications: signal/primary user detection [Kapoor15]; more are summarized in [Usama17]
• Algorithms: deep learning, variational autoencoder, generative adversarial network (GAN)
Reinforcement learning (RL), which obtains an optimal strategy through trial and error
• Applications: handover control [Dhahri12, Koda18]; channel allocation [Naparstek17, Deng19]
• Algorithms: Sarsa, Q-learning, deep RL
Supervised learning, which estimates a mapping from input features to output labels
• Applications: predicting throughput/received power [Okamoto17, Nishio18]; wireless protocol estimation [Hu14]
• Algorithms: random forest, support vector machine, deep learning

Survey papers of ML applications in wireless
1. K.-L. A. Yau, P. Komisarczuk, and P. D. Teal, “Reinforcement learning for context awareness and intelligence in wireless networks: Review, new features and open issues,” J. Netw. Comput. Appl., vol. 35, no. 1, pp. 253–267, 2011.
2. T. T. T. Nguyen and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” IEEE Commun. Surv. Tutorials, vol. 10, no. 4, pp. 56–76, 2008.
3. L. Gavrilovska, V. Atanasovski, I. Macaluso, and L. A. Dasilva, “Learning and reasoning in cognitive radio networks,” IEEE Commun. Surv. Tutorials, vol. 15, no. 4, pp. 1761–1777, 2013.
4. M. Bkassiny, Y. Li, and S. K. Jayaweera, “A Survey on Machine-Learning Techniques in Cognitive Radios,” IEEE Commun. Surv. Tutorials, vol. 15, no. 3, pp. 1136–1159, Jan. 2013.
5. P. V. Klaine, M. A. Imran, O. Onireti, and R. D. Souza, “A Survey of Machine Learning Techniques Applied to Self Organizing Cellular Networks,” IEEE Commun. Surv. Tutorials, vol. 19, no. 4, pp. 1–1, 2017.

Supervised learning task
A problem of estimating a mapping $f(\cdot)$ from input (feature) x to output (label) y with training data $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$:
$y = f(x; w)$, where $f(\cdot)$ is parameterized by $w$.
Mathematically, a supervised learning task is an optimization problem that minimizes the error between $y_i$ and $f(x_i; w)$.
Supervised learning tasks: regression and classification.
* In the following slides, $f(x; w)$ is simply written as $f(x)$.

Regression task
A task of estimating $f(\cdot)$ such that $y = f(x)$ from training data.
For example, polynomial regression:
$\min_{w} \frac{1}{2} \sum_{i=1}^{N} \left( y_i - f(x_i) \right)^2, \quad \text{s.t. } f(x) = w_0 + w_1 x + w_2 x^2 + w_3 x^3 + \cdots + w_M x^M$
[Figure: ground truth curve, training data points, and estimated curve.]
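The polynomial regression problem above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `fit_polynomial` and `predict`, the toy data, and the choice of solving the normal equations by Gaussian elimination are not from the slides.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit of f(x) = w_0 + w_1 x + ... + w_M x^M (M = degree)."""
    n = degree + 1
    # Normal equations A w = b with A[j][k] = sum_i x_i^(j+k), b[j] = sum_i y_i x_i^j.
    a = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for j in reversed(range(n)):
        s = sum(a[j][k] * w[k] for k in range(j + 1, n))
        w[j] = (b[j] - s) / a[j][j]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(wj * x ** j for j, wj in enumerate(w))


# Noiseless toy data generated from y = 1 + 2x + 3x^2 (hypothetical example).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
w = fit_polynomial(xs, ys, degree=2)
```

With noiseless data the fitted coefficients recover the generating polynomial up to rounding; with measured data, the degree M becomes a hyperparameter to tune on validation data.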

Classification task
A task of estimating a mapping from x to the class y that x belongs to, based on training data $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$.
Binary classification classifies input data x into one of two classes.
• Dog: x1, x2, x4, …
• Other: x3, x5, …
Multiclass classification classifies input data into one class among multiple classes.
• Dog: x1, x2
• Cat: x6
• Bird: x4
• Other: x3, x5

Mathematical expression of the (binary) classification problem
Define the binary classes as $y_i \in \{1, -1\}$, and let the classification function $g(\cdot)$ be
$g(f(x)) = \begin{cases} 1, & f(x) > 0, \\ -1, & f(x) < 0. \end{cases}$
The problem becomes the task of estimating $f(\cdot)$ that outputs a positive value for data belonging to class 1 and a negative value for data belonging to class 2. Here, $f(x) = 0$ is a hyperplane separating the data of the two classes.
[Figure: data of class 1 and class 2 in the feature space (x1, x2, x3), separated by the plane f(x) = 0.]

Difference between regression and classification
They are almost the same task: both estimate $f(\cdot)$ from a dataset.
• Regression directly estimates $f(\cdot)$ by minimizing the error between y and $f(x)$.
• Classification estimates $f(\cdot)$ together with a classification function $g(\cdot)$.
Most ML algorithms can be applied to both regression and classification tasks.
What is different among ML algorithms?
• The definition of the error between y and $f(x)$: the loss function
• The model of $f(\cdot)$
• How the error minimization problem is solved

Loss functions
For regression tasks
• Squared error: $\ell(y, f(x)) = (y - f(x))^2$
• Absolute error: $\ell(y, f(x)) = |y - f(x)|$
For classification tasks
• 0-1 loss function: $\ell(y, f(x)) = \begin{cases} 1, & y f(x) < 0, \\ 0, & y f(x) \ge 0 \end{cases}$
• Hinge loss function: $\ell(y, f(x)) = \max(0, 1 - y f(x))$
• Logistic loss function: $\ell(y, f(x)) = \log(1 + e^{-y f(x)})$
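The loss functions listed above translate directly into code. The following is a minimal sketch; the function names are chosen here for illustration, `fx` stands for the model output f(x), and for the classification losses y is assumed to be +1 or -1.

```python
import math


def squared_error(y, fx):
    return (y - fx) ** 2


def absolute_error(y, fx):
    return abs(y - fx)


def zero_one_loss(y, fx):
    # 1 when the sign of f(x) disagrees with the label y in {+1, -1}.
    return 1.0 if y * fx < 0 else 0.0


def hinge_loss(y, fx):
    # Zero only when the sample is classified correctly with margin >= 1.
    return max(0.0, 1.0 - y * fx)


def logistic_loss(y, fx):
    return math.log(1.0 + math.exp(-y * fx))
```

Unlike the 0-1 loss, the hinge and logistic losses still penalize correctly classified samples that lie close to the boundary, and they vary continuously with f(x), which makes them usable with gradient-based training.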

Model of $f(\cdot)$ (from input to output)
• Linear model: $f(x) = w^\top x + b$
• Tree topology
• Neural network

How do we solve the problem?
For example, let $f$ be $f(x) = w^\top x + b$. The learning task is to find the optimal parameters $w, b$ that minimize the sum of the loss function, $\sum_{i \in [N]} \ell(y_i, f(x_i))$.
When the squared error function is used, the problem can be expressed as
$\min_{w, b} \sum_{i \in [N]} \left( y_i - (w^\top x_i + b) \right)^2$
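One way to solve the squared-error problem for the linear model is plain gradient descent, sketched below. The function name `fit_linear`, the learning rate, the epoch count, and the toy data are illustrative assumptions, and the gradient is averaged over the N samples so that the step size does not depend on the dataset size.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit f(x) = w.x + b by batch gradient descent on the squared error."""
    dim = len(xs[0])
    n = len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            # Prediction error of the current model on this sample.
            err = sum(wj * xj for wj, xj in zip(w, x)) + b - y
            for j in range(dim):
                grad_w[j] += 2.0 * err * x[j]
            grad_b += 2.0 * err
        # Gradient step with the gradient averaged over the dataset.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical noiseless data from y = 2*x0 - 1*x1 + 0.5.
xs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [0.0, 2.0]]
ys = [2.0 * x[0] - 1.0 * x[1] + 0.5 for x in xs]
w, b = fit_linear(xs, ys)
```

For a linear model the same problem also has a closed-form least-squares solution; gradient descent is shown because it is the approach that carries over to neural networks.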

Basics of Deep learning

Deep learning
• A deep stack of neural networks (NNs)
• Hinton et al. made a breakthrough with the DBN (Deep Belief Network), which achieves much higher accuracy than conventional ML algorithms [Hinton06].
• Various neural networks and many techniques for training deep neural networks (DNNs) are being developed day by day.
[Figure: a shallow neural network and a deep neural network, each mapping an input feature vector to an output.]

Neural network (1/2)
An ML algorithm that uses a simplified model of a biological neural network.
• The signal input to a neuron is denoted as $u = \sum_i w_i x_i + b$.
• The output in response to the input signal is denoted as $y = g(u)$, where $g(\cdot)$ represents the activation function.
• The activation function is a simplified model of the firing of a neuron. A non-linear activation function enables the NN to learn a non-linear model.
• Example of activation function, the sigmoid function: $g(u) = \frac{1}{1 + e^{-u}}$
[Figure: a neuron with inputs x1, x2, x3 (feature vector), weights w1, w2, w3, internal signal u, and output y = g(u).]
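A single neuron as described above can be written as follows. This is a minimal sketch; the function names and the example inputs are chosen here for illustration.

```python
import math


def sigmoid(u):
    """Sigmoid activation g(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    u = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(u)


# Three inputs as in the figure; the weights and bias are arbitrary values.
y = neuron([1.0, 2.0, 3.0], [0.5, -0.5, 0.0], 0.5)
```

Here u = 0.5 - 1.0 + 0.0 + 0.5 = 0, so the neuron outputs g(0) = 0.5, the midpoint of the sigmoid.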

Neural network (2/2)
A neural network is a network of neurons.
• The input layer receives the feature vector and forwards it to the next hidden layer.
• The hidden layers learn the mapping from the input to the output. Layers close to the input layer may perform feature extraction.
• The output layer integrates the signals propagated through the NN into a certain format and outputs it. This layer should be designed in accordance with the task we solve.

Basic structure of output layer
Regression
• A linear mapping of the signal u from the previous layer is used for activation (output: y = u).
• The squared loss or absolute loss function is used as the loss function.
Classification
• The softmax function is used for activation; it outputs the probability of each class (class A, class B, class C).
• Cross entropy is used as the loss function.
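The classification output layer can be sketched as a softmax followed by a cross-entropy loss. The function names and the example scores are illustrative assumptions; subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import math


def softmax(u):
    """Turn a list of scores into class probabilities that sum to 1."""
    m = max(u)
    exps = [math.exp(v - m) for v in u]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])


# Scores from the previous layer for classes A, B, C (arbitrary values).
p = softmax([2.0, 1.0, 0.1])
loss_if_a = cross_entropy(p, 0)
```

The loss is small when the network assigns high probability to the true class and grows without bound as that probability approaches zero.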

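Training of neural network
Training is almost the same as for conventional ML algorithms: define a loss function $\ell(y_i, f(x_i))$ and solve $\min_w \sum_{i \in [N]} \ell(y_i, f(x_i))$ by gradient descent.
Gradient descent: calculate the gradient of the loss function for $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$,
$\nabla \ell(y_i, f(x_i)) = \nabla \ell_i = \left( \frac{\partial \ell_i}{\partial w_1}, \ldots, \frac{\partial \ell_i}{\partial w_M} \right)$,
and update w as
$w' = w - \eta \sum_{(x_i, y_i) \in \mathcal{D}} \nabla \ell_i$, where $\eta$ is the step size (learning rate).
• Large computation and memory cost to calculate the gradient
• The gradient could become zero, so that the NN cannot be updated (vanishing gradient problem)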

Gradient descent method for neural network
Backpropagation: a method to calculate the gradient in NNs with a smaller computation cost.
Mini-batch training: a kind of stochastic gradient descent that updates the weights using the gradient calculated from a portion of the training data.
• Less memory consumption
• Less risk of vanishing gradient
• Accelerated computation by parallel computing

Procedure of mini-batch training
1. Separate the dataset into training data and validation data
2. Generate mini-batches from the training data
3. For each mini-batch
   a. calculate the gradient for each datum by the backpropagation method
   b. update the weights with the gradient
4. Evaluate the model performance with the validation data
5. Repeat steps 2 to 4 (one repetition is called an epoch) until the model achieves the required performance.
If the validation loss does not decrease, you should rebuild the neural network or tune the hyperparameters.
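The procedure above can be sketched without any framework for a one-dimensional linear model, where the gradient is available in closed form instead of through backpropagation. Everything named here (`train_minibatch`, `mse`, the 80/20 split, the batch size, the learning rate, the stopping threshold) is an illustrative assumption.

```python
import random


def mse(data, w, b):
    """Mean squared error of f(x) = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_minibatch(data, batch_size=4, lr=0.1, max_epochs=500, target=1e-6, seed=0):
    rng = random.Random(seed)
    data = data[:]
    rng.shuffle(data)
    # Step 1: separate the dataset into training and validation data.
    split = int(0.8 * len(data))
    train, val = data[:split], data[split:]
    w, b = 0.0, 0.0
    history = []
    for _ in range(max_epochs):
        # Step 2: generate mini-batches from the reshuffled training data.
        rng.shuffle(train)
        batches = [train[i:i + batch_size] for i in range(0, len(train), batch_size)]
        # Step 3: one gradient step per mini-batch.
        for batch in batches:
            gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
            w, b = w - lr * gw, b - lr * gb
        # Step 4: evaluate the model with the validation data.
        history.append(mse(val, w, b))
        # Step 5: stop once the required performance is reached.
        if history[-1] < target:
            break
    return w, b, history


# Hypothetical noiseless data from y = 3x - 1.
data = [(i / 10.0, 3.0 * (i / 10.0) - 1.0) for i in range(20)]
w, b, history = train_minibatch(data)
```

The `history` list plays the role of the validation curve: if it stops decreasing well above the target, that is the signal to change the model or the hyperparameters.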

Implementation: all you need is a deep learning framework
A DL framework is a library of models and algorithms related to deep learning (e.g., backpropagation and activation functions).
You need to program
1. preprocessing of data
2. definition of the model
3. setting of the loss function and optimizer
4. operation of training
(Tea time)
A trained NN is obtained.
[Figure: code example (Chainer).]

ML for Wireless Networks

Where can we apply machine learning in wireless systems?
• NW state prediction: obtaining the information required for NW control from measurable information
• NW operation: controlling network parameters to improve QoS/QoE
Conventional approach: measured information (received power, throughput, packet loss ratio, delay, RTT, CSI, etc.) → prediction of NW state → optimization- or heuristics-based NW operation → optimal NW operation
ML-based approach: learning-based prediction, learning-based NW operation, or end-to-end learning-based NW operation

Defining ML tasks
NW state prediction → classification or regression problem
• Future throughput or received power prediction (regression)
• Primary user detection in cognitive radio (binary classification)
• Media access protocol estimation (multiclass classification)
NW operation → reinforcement learning, or classification/regression + optimization/heuristics
• Vertical handover control
• Routing

Preparing dataset
We can build a system in which a labeled dataset can be obtained. For example, for the received power prediction task, the received power can be measured frame by frame, and the dataset can be generated from the measurements.
Data preprocessing
• Data should be formatted and scaled in accordance with the ML algorithm.
• Denoising and resampling are sometimes required.

Selecting/designing ML algorithm
NNs allow us to design various structures, for example the activation function, the number of units, and the depth of the network. It is like LEGO blocks.
Example of NN (layer blocks shown on the slide, in extracted order): 2x2x2 3D Average Pooling; 3x3 ConvLSTM, 64; 3x3 ConvLSTM, 64; 2x2 2D Average Pooling; Flattening; Fully Connected, 512; Fully Connected, 1; 1x3x3 3D Convolution, 64; ReLU; 1x3x3 3D Convolution, 128; Batch Normalization.
Non-NN ML algorithms still work very well:
• Support vector machine (SVM)
• Random forest
• Gradient boosting (XGBoost, LightGBM)

NN structure
• Dense network (fully connected): units are fully connected to the units in the next layer.
• Recurrent neural network: stores a memory of past inputs. It works well for time series data and for natural language, which has context.
• Convolutional neural network: a neural network using convolution layers, which works well for features where adjacent units are related.

Convolutional Neural Network (CNN)
A CNN tries to learn filters that activate when a certain pattern appears in the input data.

Example of CNN filter operation
A filter is applied to the pixel values of one part of the image. Where the pattern matches, the filter is activated; when the filter is applied to another part, it is not activated. The pattern of activated and non-activated positions may describe the object you want to find.
Reference: A Beginner's Guide To Understanding Convolutional Neural Networks, https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/
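The filter operation can be illustrated with a small pure-Python sketch. The 4x4 toy image, the vertical-edge filter, and the function name `apply_filter` are hypothetical; the operation is the sliding-window product-sum that DL frameworks call convolution, followed by ReLU.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image, sum the products, then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))  # ReLU keeps only positive responses
        out.append(row)
    return out


# Toy image whose pixel values step from 0 to 1 between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]
response = apply_filter(image, kernel)
```

The response is non-zero only in the column where the window straddles the edge, which is the "activated" versus "not activated" behavior described on the slide. In a CNN the kernel values are learned rather than hand-chosen.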

How do we design NN structure?
Finding the optimal structure is difficult. However, there is know-how. For image-like data, a CNN works well. For time series data, an RNN works well.
• Raw data → fully connected NN
• Image-like data → CNN
• (Time) series data → RNN
• Time series of image-like data → 3D-CNN, CNN-based RNN
• Graph data → graph convolutional network (GCN)
[Diagram label: Transform]

Application of DL in wireless

mmWave received power prediction based on ML and camera
[Okamoto17] H. Okamoto, T. Nishio, M. Morikura, K. Yamamoto, D. Murayama, and K. Nakahira, “Machine-Learning-Based Throughput Estimation Using Images for mmWave Communications,” IEEE VTC-Spring, June 2017.
[Okamoto18] H. Okamoto, T. Nishio, M. Morikura, and K. Yamamoto, “Recurrent neural network-based received signal strength estimation using depth images for mmWave communications,” IEEE CCNC 2018, Jan. 2018.
[Nishio19+] T. Nishio, H. Okamoto, K. Nakashima, Y. Koda, K. Yamamoto, M. Morikura, Y. Asai, and R. Miyatake, “Machine-learning-based future received signal strength prediction using depth images for mmWave communications,” IEEE JSAC Machine Learning in Wireless Communication, Oct. 2019. (preprint is available from arXiv:1803.09698)

mmWave WLANs [Okamoto18] [Nishio19+]
Computer vision helps predict human blockage!

We have… [Okamoto18] [Nishio19+]
• Images and received power
• No model that connects them
• Intuition: we may be able to predict the received power from images.

ML helps us! [Okamoto18] [Nishio19+]
Images → Machine learning! → Received power
Intuition: we may be able to predict the received power from images.

Deep learning enabled future received power prediction from depth camera imagery (1/3) [Okamoto18] [Nishio19+]
We designed a regression problem in which the input is a time series of imagery and the output is the future received power.
• Input (feature): time series of imagery from frame T - 15 to the current frame T
• Output (label): received power at frame T + 15
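The input/label pairing above can be sketched as a sliding window over synchronized frames and power measurements. The function name `make_dataset`, the integer stand-ins for depth images, and the synthetic power values are hypothetical. If the 15-frame horizon is meant to match the 0.5 s prediction horizon mentioned earlier, the frame rate would be 30 frames/s, which is an inference and not a number stated on the slides.

```python
def make_dataset(frames, powers, history=15, horizon=15):
    """Pair frames[T-history .. T] with the received power at T + horizon."""
    features, labels = [], []
    for t in range(history, len(frames) - horizon):
        features.append(frames[t - history:t + 1])  # 16 frames, T-15 to T
        labels.append(powers[t + horizon])          # label at T+15
    return features, labels


# Stand-ins: integers instead of depth images, synthetic power values in dBm.
frames = list(range(100))
powers = [-55.0 - 0.1 * i for i in range(100)]
features, labels = make_dataset(frames, powers)
```

Each sample uses only frames up to the current time T, so the label at T + 15 is never visible to the model as an input.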

Deep learning enabled future received power prediction from depth camera imagery (2/3)
Example of training dataset: the features are image sequences and the labels are received power values (e.g., -70 dBm, -55 dBm).
The NN is trained with this dataset.
[Figure: the NN layer stack (3D convolution, batch normalization, ReLU, 3D average pooling, ConvLSTM, 2D average pooling, flattening, fully connected).]

Deep learning enabled future received power prediction from depth camera imagery (3/3) [Okamoto18] [Nishio19+]
Convolutional LSTM network (ConvLSTM) [Xingjian15]: expected to capture spatio-temporal features.
Convolutional neural network (Conv, CNN)
• The convolution unit captures spatial features consisting of adjacent pixels.
Long short-term memory (LSTM)
• The NN has a recurrent architecture, which is referred to as an RNN.
• The output of the LSTM is input to the LSTM unit at the next step.
[Figure: the NN layer stack (3D convolution, batch normalization, ReLU, 3D average pooling, ConvLSTM, 2D average pooling, flattening, fully connected) and an LSTM unit with input x and output y.]

Experimental setup [Nishio19+]
mmWave network
• IEEE 802.11ad
• Channel: 62.48 GHz
• PHY rate: Auto (385 to 4620 Mbit/s)
AP
• Dell WiGig docking station
STA
• WiGig-equipped laptop (Dell)
Pedestrians move at random speeds and block the LOS path every 6 s.
The NN was trained with a dataset from 10 minutes of measurement.

For avoiding unintended cheating (1/2)
We should be careful when preprocessing the dataset. We should avoid
• Using a dataset that was used in validation for performance evaluation.
A model tuned using validation data could overfit to that data. To remove this bias, the data for evaluation should be excluded from model training.
The dataset is divided into three parts: for training, for validation, and for performance evaluation (used only for evaluation).

For avoiding unintended cheating (2/2)
• Data leakage, which means training with data that are unavailable in practice.
Data leakage could occur when training and test data are sampled randomly from time series data. A model trained with such data could have knowledge of future trends of the prediction target.
[Figure: time axis split at the moment of prediction; data before it are usable, data after it are unavailable in practice.]
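One simple way to respect both points (a separate evaluation set and no leakage from the future) is to split a time series chronologically instead of randomly. The sketch below assumes the samples are already sorted by time; the function name and the 60/20/20 proportions are illustrative choices, not taken from the slides.

```python
def chronological_split(samples, train_frac=0.6, val_frac=0.2):
    """Split time-ordered samples into train / validation / test without shuffling."""
    n = len(samples)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# Placeholder samples: (timestamp, received power in dBm).
samples = [(t, -60.0) for t in range(100)]
train, val, test = chronological_split(samples)
```

Every training sample is older than every validation sample, and every validation sample is older than every test sample, so the model is never trained on data from after the moment of prediction. When each sample is a window of frames, neighboring windows near the split boundaries can still overlap, so a gap between the parts may be needed.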

Experimental results [Nishio19+]
[Figure: red line: received power predicted 0.5 s before the time; blue line: measured received power.]

ML in Wireless Networks

Issues and Key techniques
Issues
• Large computation is required for training deep NNs
• A sufficient amount of training data must be prepared
Key techniques
• Transfer learning: a technique for reducing the computation time and the amount of data required for training.
• Federated learning: a technique for training a model with the data and computation resources of mobile users.

Transfer learning
To reduce the training cost (computation time, required amount of data), a model trained with a certain dataset is leveraged for another, similar task.
• Transfer (copy): the pretrained NN is copied to the target NN.
• Finetuning: the target NN is trained with a small amount of training data of the target task.
A pretrained ResNet50 is available in Keras: keras.applications.resnet50.ResNet50

Reducing training cost based on transfer learning in mmWave received power prediction [Mikuma19]
[Mikuma19] T. Mikuma, T. Nishio, M. Morikura, K. Yamamoto, Y. Asai, and R. Miyatake, “Transfer Learning-Based Received Power Prediction Using RGB-D camera in mmWave Networks,” Proc. IEEE VTC Spring, May 2019.

Received power prediction [Mikuma19]
A deep neural network predicts the future received power.
Problem: training an accurate prediction model requires a large amount of data and computation.
• 18000 data samples
• 4 hours of computation on a GPU server

Transfer learning based received power prediction [Mikuma19]
1. Pretraining (pre-train server): an NN model is trained with simulation data generated from a 3D spatial model by 3D CG and ray-tracing simulation.
2. Transfer: the NN in the prediction unit is initialized with the pretrained NN.
3. Finetuning (deployed environment): the NN is fit to the measured depth images and the measured received power between BS and STA.
Result: an accurate prediction model.

Simulation setup for pretraining [Mikuma19]
Images are generated by a 3D CG generator, and the received power is calculated by a ray-tracing simulator.
[Figure: simulated depth image and floor plan. A passage between two walls (labeled dimensions: 4 m and 10 m) with pedestrians crossing the LOS path; camera (height 2.25 m), AP (height 2.25 m), and STA (height 1.25 m).]

Experimental setup for finetuning and performance test [Mikuma19]
mmWave network
• IEEE 802.11ad
• Channel: 62.48 GHz
• PHY rate: Auto (385 to 4620 Mbit/s)
AP
• Dell WiGig docking station
STA
• WiGig-equipped laptop (Dell)
Pedestrians move at random speeds and block the LOS path every 6 s.
The NN was trained with a portion of a dataset from 10 minutes of measurement.

RMSE evaluation [Mikuma19]
At earlier epochs, the proposed scheme achieves a higher prediction accuracy than the previous scheme without transfer learning.
[Figure: two panels, one for the NN finetuned with a dataset from 8 minutes of measurement and one for the NN finetuned with a dataset from 1 minute of measurement.]
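For reference, the root-mean-square error (RMSE) used as the accuracy metric here can be computed as follows. The function name and the example values are illustrative; the inputs are assumed to be equal-length lists of predicted and measured received power in dBm.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values: one perfect prediction and one that is off by 4 dB.
error_db = rmse([-60.0, -62.0], [-60.0, -66.0])
```

Because the errors are squared before averaging, a few large misses, such as a mistimed blockage, weigh more heavily in the RMSE than many small deviations.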

Federated learning with heterogeneous nodes in wireless networks [Nishio19]
[Nishio19] T. Nishio and R. Yonetani, “Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge,” Proc. IEEE ICC 2019, May 2019.

Federated Learning (FL) [McMahan17] [Nishio19]
A machine learning framework that uses clients' data and computation resources while the raw data stay in local storage.
FL flow (between a server/coordinator at the network edge and clients over a wireless network):
1. Distributing model parameters
2. Updating the model with own data
3. Uploading the new parameters
4. Aggregating client updates
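Step 4 of the flow, aggregating client updates, is commonly done by averaging the uploaded parameters with weights proportional to each client's amount of data, as in federated averaging [McMahan17]. The sketch below assumes each client uploads a flat list of parameters; the function name and the numbers are illustrative.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


# Two hypothetical clients with 100 and 300 local samples.
updates = [[1.0, 0.0], [3.0, 2.0]]
sizes = [100, 300]
global_model = federated_average(updates, sizes)
```

Only the parameter vectors cross the network in this scheme; the raw data that produced them stay on the clients.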

Distributed Mechanism for Machine Learning
Information (knowledge) shared for the model update:
• Gradient [Mnih16] [Jianmin16]
• Weights of the updated model [McMahan17]
• Data or experience [Horgan16]

FL in Mobile [Nishio19]
Conventional FL does not take into account the differences in wireless link quality and computation power among clients in wireless networks.
FL in mobile networks has resource constraints: bandwidth limitation, computation power limitation, and time limitation.
• Heterogeneous clients
• Heterogeneous resources: computation capability, throughput, data amount, data quality
Some clients could become a bottleneck of FL because of the heterogeneity of clients' resources.
[Figure: clients (user equipment) connected through wireless links (cellular network) to a base station and a server on an MEC platform.]

What's different? [Nishio19]
Distributed ML
• Data is on the server (database).
• Data and computation tasks are assigned to distributed computing devices.
FL
• Data is on distributed devices.
• The base model is distributed to mobile devices, and the mobile devices update it with their data.

Federated Learning for Mobile Networks [Nishio19]
Model performance can increase with increasing
1. the number of FL rounds
2. the number of clients
   a. the amount of data used for FL
   b. data variation
Federated Learning with Client Selection (FedCS) selects the clients that participate in FL in order to reduce the time required to train the model to a certain performance.
• To ensure the number of rounds, the duration of a round is fixed.
• Within that duration, clients are selected so as to maximize the number of participating clients.
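The idea of fitting as many clients as possible into a fixed round duration can be illustrated with a simplified greedy sketch. This is not the algorithm of [Nishio19]; it only assumes that each client has an estimated update time and upload time (hypothetical numbers below), that updates run in parallel, and that uploads share the channel one at a time.

```python
def select_clients(clients, round_deadline):
    """Greedily add the client that finishes earliest until the deadline is hit.

    clients: dict name -> (update_time_s, upload_time_s), hypothetical estimates.
    """
    selected, elapsed = [], 0.0
    remaining = dict(clients)
    while remaining:
        def finish_time(name):
            update, upload = remaining[name]
            # Upload can start once the channel is free and the update is done.
            return max(elapsed, update) + upload

        best = min(remaining, key=finish_time)
        t = finish_time(best)
        if t > round_deadline:
            break  # no remaining client can finish within the round
        selected.append(best)
        elapsed = t
        del remaining[best]
    return selected, elapsed


clients = {"a": (2.0, 1.0), "b": (1.0, 1.0), "c": (10.0, 5.0), "d": (3.0, 2.0)}
selected, elapsed = select_clients(clients, round_deadline=8.0)
```

In this toy setting the slow client "c" is left out of the round, so the three faster clients finish within the deadline instead of everyone waiting for the bottleneck.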

Experimental Evaluation [Nishio19]
• Task: image classification (CIFAR-10, Fashion-MNIST)
• Model: convolutional neural network with 6 convolution layers and 3 fully connected layers
• 1000 clients in a cell of radius 2 km
• Client's computation power: 10 to 100 images/s
• Wireless setting based on an LTE model

Accuracy vs. Elapsed Time in Simulation [Nishio19]
FedCS reduces the training time needed to achieve high accuracy by 50-70%, even when the clients' throughput estimates are not accurate.
[Figure: accuracy versus elapsed time for CIFAR-10 and Fashion-MNIST.]

Conclusion
Machine learning is still a hot topic in wireless communications research. Using machine learning has become easy thanks to DL frameworks and many libraries.
My opinion
• I expect that wireless NW operation will gradually shift to data-driven operation empowered by ML, while model-driven operation will continue to exist.
• Finding a good problem is difficult, but important. I expect that good problems exist in new challenges that have never been solved by conventional approaches.

You can get the latest slides here!
https://www.imc.cce.i.kyoto-u.ac.jp/icc2019/

Textbooks
Theory
• K. P. Murphy, Machine Learning: A Probabilistic Perspective. MIT Press, 2012.
• C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
• I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge: MIT Press, 2016.
More practical
• A. C. Müller and S. Guido, Introduction to Machine Learning with Python: A Guide for Data Scientists. O'Reilly Media, 2016.
• M. Lapan, Deep Reinforcement Learning Hands-On: Apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more. Packt Publishing, 2018.

Other information source
Twitter
• Ian Goodfellow @goodfellow_ian: Google Brain research scientist leading a team studying adversarial techniques in AI
• Daisuke Okanohara @hillbig: Co-founder and EVP of Preferred Networks (PFN)

My recent works can be found at
https://scholar.google.co.jp/citations?hl=ja&user=hHnMMMkAAAAJ&view_op=list_works&sortby=pubdate
Most of them are related to machine learning.

Reference 1
[Okamoto17] H. Okamoto, T. Nishio, M. Morikura, K. Yamamoto, D. Murayama, and K. Nakahira, “Machine-Learning-Based Throughput Estimation Using Images for mmWave Communications,” in Proc. IEEE Vehicular Technology Conference (VTC-Spring), June 2017.
[Okamoto18] H. Okamoto, T. Nishio, M. Morikura, and K. Yamamoto, “Recurrent neural network-based received signal strength estimation using depth images for mmWave communications,” IEEE CCNC 2018, Jan. 2018.
[Nishio19+] T. Nishio, H. Okamoto, K. Nakashima, Y. Koda, K. Yamamoto, M. Morikura, Y. Asai, and R. Miyatake, “Machine-learning-based future received signal strength prediction using depth images for mmWave communications,” IEEE JSAC Machine Learning in Wireless Communication, Oct. 2019. (preprint is available from arXiv:1803.09698)
[Mikuma19] T. Mikuma, T. Nishio, M. Morikura, K. Yamamoto, Y. Asai, and R. Miyatake, “Transfer Learning-Based Received Power Prediction Using RGB-D camera in mmWave Networks,” Proc. IEEE VTC Spring, May 2019.
[Nishio19] T. Nishio and R. Yonetani, “Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge,” Proc. IEEE ICC 2019, May 2019.

Reference 2
[Lin09] P. Lin and T. Lin, “Machine-learning-based adaptive approach for frame-size optimization in wireless LAN environments,” IEEE Trans. Veh. Technol., vol. 58, no. 9, pp. 5060–5073, 2009.
[Kurniawan] E. Kurniawan, L. Zhiwei, and S. Sun, “Machine Learning-based Channel Classification and Its Application to IEEE 802.11ad Communications,” in Proc. IEEE Global Telecommunications Conference, GLOBECOM 2017, 2017.
[Naparstek17] O. Naparstek and K. Cohen, “Deep Multi-User Reinforcement Learning for Dynamic Spectrum Access in Multichannel Wireless Networks,” in Proc. IEEE Global Telecommunications Conference, GLOBECOM 2017, 2017.
[Dhahri12] C. Dhahri and T. Ohtsuki, “Q-Learning Cell Selection for Femtocell Networks,” in Proc. IEEE Globecom 2012, pp. 4975–4980, 2012.
[Koda18] Y. Koda, K. Yamamoto, T. Nishio, and M. Morikura, “Reinforcement Learning Based Predictive Handover for Pedestrian-Aware mmWave Networks,” in Proc. IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 1–6, Honolulu, HI, USA, April 16-19, 2018.

Reference 3
[Hu14] S. Hu, Y. Yao, and Z. Yang, “MAC protocol identification using support vector machines for cognitive radio networks,” IEEE Wirel. Commun., no. February, pp. 52–60, 2014.
[Deng19] W. Deng, S. Kamiya, K. Yamamoto, T. Nishio, and M. Morikura, “Replica Exchange Spatial Adaptive Play for Channel Allocation in Cognitive Radio Networks,” Proc. IEEE VTC Spring, May 2019.
[Xingjian15] X. Shi et al., “Convolutional LSTM network: A machine learning approach for precipitation nowcasting,” in Proc. Advances in Neural Information Processing Systems, 2015.
[Zhu17] G. Zhu, L. Zhang, P. Shen, and J. Song, “Multimodal Gesture Recognition Using 3-D Convolution and Convolutional LSTM,” IEEE Access, vol. 5, pp. 4517–4524, 2017.
[Kapoor15] G. Kapoor and K. Rajawat, “Outlier-aware cooperative spectrum sensing in cognitive radio networks,” Physical Communication, vol. 17, pp. 118–127, 2015.
[Usama17] M. Usama, J. Qadir, A. Raza, H. Arif, K.-L. A. Yau, Y. Elkhatib, A. Hussain, and A. Al-Fuqaha, “Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges,” pp. 1–37, 2017.

Reference 4
[Alvin11] K.-L. A. Yau, P. Komisarczuk, and P. D. Teal, “Reinforcement learning for context awareness and intelligence in wireless networks: Review, new features and open issues,” J. Netw. Comput. Appl., vol. 35, no. 1, pp. 253–267, 2011.
[Nguyen08] T. T. T. Nguyen and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” IEEE Commun. Surv. Tutorials, vol. 10, no. 4, pp. 56–76, 2008.
[Gavrilovska13] L. Gavrilovska, V. Atanasovski, I. Macaluso, and L. A. Dasilva, “Learning and reasoning in cognitive radio networks,” IEEE Commun. Surv. Tutorials, vol. 15, no. 4, pp. 1761–1777, 2013.
[Bkassiny13] M. Bkassiny, Y. Li, and S. K. Jayaweera, “A Survey on Machine-Learning Techniques in Cognitive Radios,” IEEE Commun. Surv. Tutorials, vol. 15, no. 3, pp. 1136–1159, Jan. 2013.
[Klaine17] P. V. Klaine, M. A. Imran, O. Onireti, and R. D. Souza, “A Survey of Machine Learning Techniques Applied to Self Organizing Cellular Networks,” IEEE Commun. Surv. Tutorials, vol. 19, no. 4, pp. 1–1, 2017.

Reference 5
[Mnih16] V. Mnih et al., “Asynchronous methods for deep reinforcement learning,” in Proc. International Conference on Machine Learning, 2016.
[Horgan16] D. Horgan et al., “Distributed prioritized experience replay,” arXiv preprint arXiv:1803.00933, 2018.
[Jianmin16] J. Chen, R. Monga, S. Bengio, and R. Jozefowicz, “Revisiting distributed synchronous SGD,” in ICLR Workshop Track, 2016.
[McMahan17] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proc. International Conference on Artificial Intelligence and Statistics, Apr. 2017.
