eRum2018: Tools for using TensorFlow with R

Andrie de Vries

May 16, 2018
Transcript

  1. Overview
     • Cloud GPU
     • R packages
     • tfruns
     • cloudml
     • Resources
     StackOverflow: andrie | Twitter: @RevoAndrie | GitHub: andrie
  2. Motivating example
     • Chest accelerometer data
     • UCI Machine Learning Repository: Activity Recognition from Single
       Chest-Mounted Accelerometer Data Set
     • Example inspired by work by Mango Solutions
  3. Motivating example
     • Training data: samples of people walking, recording acceleration in the
       x, y and z directions
     • Task: from the data, predict which person is walking (15 different people)
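     The shapes this task implies are worth spelling out. A minimal sketch of the
     expected data layout, assuming fixed windows of 260 accelerometer readings per
     sample (the names x_train, y_train and person_id are illustrative, not from
     the slides):

        library(keras)
        # x_train: one window of 260 (x, y, z) readings per sample,
        #   i.e. dim(x_train) == c(n_windows, 260, 3)
        # y_train: one-hot encoding of the 15 person labels,
        #   where person_id is an integer vector in 1:15
        y_train <- to_categorical(person_id - 1, num_classes = 15)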
  4. Starting model

     library(keras)
     keras_model_sequential() %>%
       layer_conv_1d(
         filters = 40, kernel_size = 30, strides = 2,
         activation = "relu", input_shape = c(260, 3)
       ) %>%
       layer_max_pooling_1d(pool_size = 2) %>%
       layer_conv_1d(filters = 40, kernel_size = 10, activation = "relu") %>%
       layer_max_pooling_1d(pool_size = 2) %>%
       layer_flatten() %>%
       layer_dense(units = 100, activation = "sigmoid") %>%
       layer_dense(units = 15, activation = "softmax")

     1d convolution is useful for time series data and other sequences, including
     text. This model reaches ~95% accuracy on the validation set.
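     The slide shows only the architecture. A hedged sketch of the compile and fit
     steps that would typically follow (the optimizer, epochs, batch size and
     validation split are assumptions, not from the slide):

        # Assume `model` holds the network defined above
        model %>% compile(
          loss = "categorical_crossentropy",  # matches the 15-unit softmax output
          optimizer = "adam",                 # assumed
          metrics = "accuracy"
        )
        history <- model %>% fit(
          x_train, y_train,                   # shapes as sketched earlier
          epochs = 30, batch_size = 128,      # assumed values
          validation_split = 0.2
        )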
  5. Why should R users care about TensorFlow?
     • A new general-purpose numerical computing library
     • Not all data has to be in RAM
     • Highly general optimization, e.g. SGD, Adam
     • TensorFlow models can be deployed with a C++ runtime
     • R has a lot to offer as an interface language
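     To make "interface language" concrete, a minimal sketch using the tensorflow
     package's low-level API (TF 1.x style, current at the time of the talk):

        library(tensorflow)
        # Build a small graph and evaluate it in a session
        x <- tf$constant(c(1, 2, 3, 4), dtype = tf$float32)
        m <- tf$reduce_mean(x)
        sess <- tf$Session()
        sess$run(m)  # returns 2.5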
  6. GPUs
     • Deep neural networks perform best on complex perceptual problems:
       • Very high capacity models
       • Trained on large amounts of data
     • This means you need:
       • Large computational capacity (GPU / TPU)
       • Time
     • Some applications are excruciatingly slow on CPU, in particular image
       processing (convolutional networks) and sequence processing (recurrent
       neural networks)
  7. How much faster are GPU machines?
     • Many factors determine GPU performance:
       • Complexity of the model
       • Mini-batch size
       • Convolution
       • Speed of getting data into the GPU (input/output)
     • For recurrent tasks, a GPU may not offer any advantage over a CPU
     • For convolution, GPUs and TPUs can make a big difference
  8. TensorFlow tasks

     Task                    Paperspace / AWS          Cloud ML
     Training                tfruns::training_run()    cloudml
     Hyperparameter tuning   tfruns::tuning_run()      cloudml
  9. tfruns
     • https://tensorflow.rstudio.com/tools/tfruns/
     • Successful deep learning requires a huge amount of experimentation, which
       in turn requires a systematic approach to conducting experiments and
       tracking their results.
     • The training_run() function is like source(), but it automatically tracks
       and records the output and metadata of the script's execution (a minimal
       usage sketch follows).
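     A minimal usage sketch ("train.R" is a hypothetical script name):

        library(tfruns)
        # Executes the script, recording metrics, printed output and a
        # snapshot of the source under the runs/ directory
        training_run("train.R")
        # Summarise the most recent run
        latest_run()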
  10. tfruns::ls_runs()

      ls_runs() %>% as_tibble()
      # A tibble: 131 x 38
         run_dir            eval_loss eval_acc metric_loss metric_acc metric_val_loss metric_val_acc
       * <chr>                  <dbl>    <dbl>       <dbl>      <dbl>           <dbl>          <dbl>
       1 runs/2018-05-15T0…     0.189    0.951      0.0125      0.998           0.181          0.951
       2 runs/2018-05-15T0…     0.195    0.938      0.0245      0.992           0.175          0.948
       3 runs/2018-05-14T2…     0.192    0.941      0.0965      0.967           0.143          0.950
       4 runs/2018-05-14T2…     0.189    0.938      0.231       0.921           0.159          0.945
       5 runs/2018-05-14T2…     0.194    0.936      0.250       0.904           0.168          0.940
       6 runs/2018-05-14T2…     0.286    0.882      0.482       0.806           0.263          0.891
       7 runs/2018-05-14T2…     0.402    0.853      0.377       0.851           0.351          0.867
       8 runs/2018-05-14T2…     0.265    0.893      0.291       0.884           0.230          0.911
       9 runs/2018-05-14T2…     0.139    0.960      0.0308      0.989           0.108          0.970
      10 runs/2018-05-14T2…     0.403    0.873      0.670       0.749           0.372          0.886
      # ... with 121 more rows, and 31 more variables: flag_conv_1_filters <int>,
      #   flag_conv_1_kernel <int>, flag_conv_1_pooling <int>, flag_conv_1_dropout <dbl>,
      #   flag_conv_2_filters <int>, flag_conv_2_kernel <int>, flag_conv_2_pooling <int>,
      #   flag_conv_2_dropout <dbl>, flag_dense_1_nodes <int>, flag_dense_1_dropout <dbl>,
      #   flag_dense_2_nodes <int>, flag_dense_2_dropout <dbl>, flag_mini_batch_size <int>, …
  11. tfruns::ls_runs()

      Runs can be filtered and ordered, for example to keep only the most
      accurate models:

      ls_runs(eval_acc > 0.985, order = eval_acc) %>% as_tibble()

      The result has the same 38 columns as above (the remaining variables
      include samples <int>, validation_samples <int>, batch_size <int> and
      epochs <int>).
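      tfruns also ships helpers for inspecting and comparing runs; a short
      sketch:

         library(tfruns)
         best <- ls_runs(order = eval_acc, decreasing = TRUE)
         view_run(best$run_dir[1])        # open the full run report
         compare_runs(best$run_dir[1:2])  # side-by-side diff of two runs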
  12. tfruns::flags()

      FLAGS <- tfruns::flags(
        flag_integer("conv_1_filters", 16),
        flag_integer("conv_1_kernel", 32),
        flag_integer("conv_1_pooling", 2),
        flag_numeric("conv_1_dropout", 0.25)
      )

      model <- keras_model_sequential() %>%
        layer_conv_1d(
          input_shape = c(260, 3),
          filters = FLAGS$conv_1_filters,
          kernel_size = FLAGS$conv_1_kernel,
          activation = "relu"
        ) %>% …
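      The slide elides the rest of the model. A hypothetical completion showing
      where the pooling and dropout flags could plug in (the layers after the
      dropout are illustrative, not from the slide):

         library(keras)
         model <- keras_model_sequential() %>%
           layer_conv_1d(
             input_shape = c(260, 3),
             filters = FLAGS$conv_1_filters,
             kernel_size = FLAGS$conv_1_kernel,
             activation = "relu"
           ) %>%
           layer_max_pooling_1d(pool_size = FLAGS$conv_1_pooling) %>%
           layer_dropout(rate = FLAGS$conv_1_dropout) %>%
           layer_flatten() %>%
           layer_dense(units = 15, activation = "softmax")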
  13. training_run() using FLAGS

      tfruns::training_run(
        "walking_flags.R",
        flags = list(conv_1_filters = 32, conv_1_kernel = 8)
      )

      tfruns::training_run(
        "walking_flags.R",
        flags = list(conv_1_filters = 64, conv_1_kernel = 8)
      )
  14. Setting up a tuning run using flags

      tfruns::tuning_run(
        "walking_flags.R",
        sample = 128 / 1259712,
        flags = list(
          conv_1_filters  = c(16, 32, 64),
          conv_1_kernel   = c(8, 16, 32),
          conv_1_pooling  = c(2, 4),
          conv_1_dropout  = c(0.1, 0.2, 0.5),
          conv_2_filters  = c(16, 32, 64),
          conv_2_kernel   = c(8, 16, 32),
          conv_2_pooling  = c(2, 4),
          conv_2_dropout  = c(0.1, 0.2, 0.5),
          dense_1_nodes   = c(32, 64, 128, 256),
          dense_1_dropout = c(0.1, 0.2, 0.5),
          dense_2_nodes   = c(32, 64, 128, 256),
          dense_2_dropout = c(0.1, 0.2, 0.5)
        )
      )

      1.2M total combinations of flags, sampled down to 128 combinations.
  15. (screenshot)
  16. Extract the best performing models

      ls_runs(order = "eval_acc") %>% head()
         run_dir  eval_acc eval_loss metric_loss metric_acc metric_val_loss metric_val_acc
       * <chr>       <dbl>     <dbl>       <dbl>      <dbl>           <dbl>          <dbl>
       1 runs/20…    0.976    0.0857      0.0970      0.965          0.0633          0.981
       2 runs/20…    0.965    0.112       0.132       0.957          0.0902          0.972
       3 runs/20…    0.964    0.115       0.0829      0.973          0.101           0.969
       4 runs/20…    0.963    0.125       0.140       0.954          0.101           0.968
       5 runs/20…    0.960    0.139       0.0308      0.989          0.108           0.970
       6 runs/20…    0.960    0.127       0.0376      0.991          0.0859          0.972
       7 runs/20…    0.960    0.137       0.0444      0.984          0.114           0.969
       8 runs/20…    0.959    0.160       0.0573      0.985          0.130           0.958
       9 runs/20…    0.959    0.141       0.0621      0.979          0.0842          0.972
      10 runs/20…    0.958    0.116       0.141       0.951          0.0931          0.966
  17. cloudml
      • https://tensorflow.rstudio.com/tools/cloudml/
      • Scalable training of models built with the keras, tfestimators and
        tensorflow R packages
      • On-demand access to training on GPUs, including NVIDIA® Tesla P100 GPUs
      • Hyperparameter tuning to optimize key attributes of model architectures
        in order to maximize predictive accuracy
  18. cloudml install and configure

      devtools::install_github("rstudio/cloudml")
      cloudml::gcloud_install()
      cloudml::gcloud_init()

      Welcome! This command will take you through the configuration of gcloud.

      Settings from your current configuration [default] are:
      core:
        account: [email protected]
        disable_usage_reporting: 'True'
        project: tensorflow-cloud-demo

      Pick configuration to use:
       [1] Re-initialize this configuration [default] with new settings
       [2] Create a new configuration
      Please enter your numeric choice:
  19. cloudml::cloudml_train()
      • Automatically uploads the contents of the working directory along with
        the script
      • Automatically installs all required R packages on the Cloud ML servers

      library(cloudml)

      # Train on the default CPU instance
      cloudml_train("mnist_mlp.R")

      # Train on a GPU instance
      cloudml_train("walking_cloudml.R", master_type = "standard_gpu")

      # Train on an NVIDIA Tesla P100 GPU
      cloudml_train("walking_cloudml.R", master_type = "standard_p100")
  20. (screenshot)

  21. (screenshot)
  22. cloudml hyperparameter tuning

      cloudml_train("mnist_cnn_cloudml.R", config = "tuning.yml")

      Submitting training job to CloudML...
      Job 'cloudml_2018_04_25_084916940' successfully submitted.

      View job in the Cloud Console at:
      https://console.cloud.google.com/ml/jobs/cloudml_2018_04_25_084916940?project=tensorflow-cloud-demo

      View logs at:
      https://console.cloud.google.com/logs?resource=ml.googleapis.com%2Fjob_id%2Fcloudml_2018_04_25_084916940&project=tensorflow-cloud-demo

      Check job status with:     job_status("cloudml_2018_04_25_084916940")
      Collect job output with:   job_collect("cloudml_2018_04_25_084916940")
      After collect, view with:  view_run("runs/cloudml_2018_04_25_084916940")
  23. Inspect the tuning trial results (and cleaning up)

      library(dplyr)
      # Assign the cleaned trials so the next slide can plot them
      trials_clean <- job_trials() %>%
        as_tibble() %>%
        select(-one_of("hyperparameters.data_dir")) %>%
        mutate_at(vars(starts_with("hyperparameters")), round, 2) %>%
        rename_all(~sub("finalMetric.", "", .)) %>%
        rename_all(~sub("hyperparameters.", "", .)) %>%
        select(-c(trainingStep)) %>%
        arrange(desc(objectiveValue))

      # A tibble: 23 x 14
         objectiveValue conv_1_dropout conv_1_filters conv_1_kernel conv_1_pooling conv_2_dropout conv_2_filters
                  <dbl>          <dbl>          <dbl>         <dbl>          <dbl>          <dbl>          <dbl>
       1          0.971          0.100           128.           24.             2.          0.200           128.
       2          0.970          0.100           128.           24.             2.          0.              128.
       3          0.969          0.              128.           24.             2.          0.500           128.
       4          0.964          0.100           128.           24.             4.          0.200           128.
       5          0.964          0.200           128.           24.             4.          0.100           128.
       6          0.960          0.100           128.           24.             2.          0.              128.
       7          0.960          0.200           128.           24.             2.          0.100           128.
       8          0.958          0.               16.           24.             2.          0.200            64.
       9          0.957          0.               16.           32.             2.          0.100           128.
      10          0.957          0.200           128.           24.             4.          0.100           128.
  24. Inspect the tuning trial results

      library(ggplot2)
      trials_clean %>%
        ggplot(aes(x = trialId, y = objectiveValue)) +
        geom_point() +
        stat_smooth(method = "lm") +
        theme_bw(20)
  25. Summary
      • For some neural network training tasks, consider GPU machines
      • You have options for using GPUs in the cloud:
        • Paperspace / AWS machines
        • Google Cloud ML
      • You have tools to do this:
        • tfruns
        • cloudml