eRum2018: Tools for using TensorFlow with R

Andrie de Vries

May 16, 2018
Transcript

  1. Overview
     • Cloud GPU
     • R packages
     • tfruns
     • cloudml
     • Resources
     StackOverflow: andrie | Twitter: @RevoAndrie | GitHub: andrie
  2. Motivating example
     • Chest accelerometer data
     • UCI Machine Learning Repository: Activity Recognition from Single
       Chest-Mounted Accelerometer Data Set
     • Example inspired by work by Mango Solutions
  3. Motivating example
     • Training data: samples of people walking, recording acceleration in the
       x, y and z directions
     • Task: from the data, predict which person is walking (15 different people)
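     The shapes this task implies are worth spelling out. A minimal sketch of the
     expected data layout, assuming fixed windows of 260 accelerometer readings per
     sample (the names x_train, y_train and person_id are illustrative, not from
     the slides):

        library(keras)
        # x_train: one window of 260 (x, y, z) readings per sample,
        #   i.e. dim(x_train) == c(n_windows, 260, 3)
        # y_train: one-hot encoding of the 15 person labels,
        #   where person_id is an integer vector in 1:15
        y_train <- to_categorical(person_id - 1, num_classes = 15)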
  4. Starting model

     library(keras)
     keras_model_sequential() %>%
       layer_conv_1d(
         filters = 40, kernel_size = 30, strides = 2,
         activation = "relu", input_shape = c(260, 3)
       ) %>%
       layer_max_pooling_1d(pool_size = 2) %>%
       layer_conv_1d(filters = 40, kernel_size = 10, activation = "relu") %>%
       layer_max_pooling_1d(pool_size = 2) %>%
       layer_flatten() %>%
       layer_dense(units = 100, activation = "sigmoid") %>%
       layer_dense(units = 15, activation = "softmax")

     1d convolution is useful for time series data and other sequences, including
     text. This model reaches ~95% accuracy on the validation set.
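     The slide shows only the architecture. A hedged sketch of the compile and fit
     steps that would typically follow (the optimizer, epochs, batch size and
     validation split are assumptions, not from the slide):

        # Assume `model` holds the network defined above
        model %>% compile(
          loss = "categorical_crossentropy",  # matches the 15-unit softmax output
          optimizer = "adam",                 # assumed
          metrics = "accuracy"
        )
        history <- model %>% fit(
          x_train, y_train,                   # shapes as sketched earlier
          epochs = 30, batch_size = 128,      # assumed values
          validation_split = 0.2
        )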
  5. Why should R users care about TensorFlow?
     • A new general-purpose numerical computing library
     • Not all data has to be in RAM
     • Highly general optimization, e.g. SGD, Adam
     • TensorFlow models can be deployed with a C++ runtime
     • R has a lot to offer as an interface language
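     To make "interface language" concrete, a minimal sketch using the tensorflow
     package's low-level API (TF 1.x style, current at the time of the talk):

        library(tensorflow)
        # Build a small graph and evaluate it in a session
        x <- tf$constant(c(1, 2, 3, 4), dtype = tf$float32)
        m <- tf$reduce_mean(x)
        sess <- tf$Session()
        sess$run(m)  # returns 2.5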
  6. GPUs
     • Deep neural networks perform best on complex perceptual problems:
       • Very high capacity models
       • Trained on large amounts of data
     • This means you need:
       • Large computational capacity (GPU / TPU)
       • Time
     • Some applications are excruciatingly slow on CPU, in particular image
       processing (convolutional networks) and sequence processing (recurrent
       neural networks)
  7. How much faster are GPU machines?
     • Many factors determine GPU performance:
       • Complexity of the model
       • Mini-batch size
       • Convolution
       • Speed of getting data into the GPU (input/output)
     • For recurrent tasks, a GPU may not offer any advantage over a CPU
     • For convolution, GPUs and TPUs can make a big difference
  8. TensorFlow tasks

     Task                    Paperspace / AWS          Cloud ML
     Training                tfruns::training_run()    cloudml
     Hyperparameter tuning   tfruns::tuning_run()      cloudml
  9. tfruns
     • https://tensorflow.rstudio.com/tools/tfruns/
     • Successful deep learning requires a huge amount of experimentation, which
       in turn requires a systematic approach to conducting experiments and
       tracking their results.
     • The training_run() function is like source(), but it automatically tracks
       and records the output and metadata of the script's execution (a minimal
       usage sketch follows).
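     A minimal usage sketch ("train.R" is a hypothetical script name):

        library(tfruns)
        # Executes the script, recording metrics, printed output and a
        # snapshot of the source under the runs/ directory
        training_run("train.R")
        # Summarise the most recent run
        latest_run()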
  10. tfruns::ls_runs()

      ls_runs() %>% as_tibble()
      # A tibble: 131 x 38
         run_dir            eval_loss eval_acc metric_loss metric_acc metric_val_loss metric_val_acc
       * <chr>                  <dbl>    <dbl>       <dbl>      <dbl>           <dbl>          <dbl>
       1 runs/2018-05-15T0…     0.189    0.951      0.0125      0.998           0.181          0.951
       2 runs/2018-05-15T0…     0.195    0.938      0.0245      0.992           0.175          0.948
       3 runs/2018-05-14T2…     0.192    0.941      0.0965      0.967           0.143          0.950
       4 runs/2018-05-14T2…     0.189    0.938      0.231       0.921           0.159          0.945
       5 runs/2018-05-14T2…     0.194    0.936      0.250       0.904           0.168          0.940
       6 runs/2018-05-14T2…     0.286    0.882      0.482       0.806           0.263          0.891
       7 runs/2018-05-14T2…     0.402    0.853      0.377       0.851           0.351          0.867
       8 runs/2018-05-14T2…     0.265    0.893      0.291       0.884           0.230          0.911
       9 runs/2018-05-14T2…     0.139    0.960      0.0308      0.989           0.108          0.970
      10 runs/2018-05-14T2…     0.403    0.873      0.670       0.749           0.372          0.886
      # ... with 121 more rows, and 31 more variables: flag_conv_1_filters <int>,
      #   flag_conv_1_kernel <int>, flag_conv_1_pooling <int>, flag_conv_1_dropout <dbl>,
      #   flag_conv_2_filters <int>, flag_conv_2_kernel <int>, flag_conv_2_pooling <int>,
      #   flag_conv_2_dropout <dbl>, flag_dense_1_nodes <int>, flag_dense_1_dropout <dbl>,
      #   flag_dense_2_nodes <int>, flag_dense_2_dropout <dbl>, flag_mini_batch_size <int>, …
  11. tfruns::ls_runs()

      Runs can be filtered and ordered, for example to keep only the most
      accurate models:

      ls_runs(eval_acc > 0.985, order = eval_acc) %>% as_tibble()

      The result has the same 38 columns as above (the remaining variables
      include samples <int>, validation_samples <int>, batch_size <int> and
      epochs <int>).
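      tfruns also ships helpers for inspecting and comparing runs; a short
      sketch:

         library(tfruns)
         best <- ls_runs(order = eval_acc, decreasing = TRUE)
         view_run(best$run_dir[1])        # open the full run report
         compare_runs(best$run_dir[1:2])  # side-by-side diff of two runs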
  12. tfruns::flags()

      FLAGS <- tfruns::flags(
        flag_integer("conv_1_filters", 16),
        flag_integer("conv_1_kernel", 32),
        flag_integer("conv_1_pooling", 2),
        flag_numeric("conv_1_dropout", 0.25)
      )

      model <- keras_model_sequential() %>%
        layer_conv_1d(
          input_shape = c(260, 3),
          filters = FLAGS$conv_1_filters,
          kernel_size = FLAGS$conv_1_kernel,
          activation = "relu"
        ) %>% …
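      The slide elides the rest of the model. A hypothetical completion showing
      where the pooling and dropout flags could plug in (the layers after the
      dropout are illustrative, not from the slide):

         library(keras)
         model <- keras_model_sequential() %>%
           layer_conv_1d(
             input_shape = c(260, 3),
             filters = FLAGS$conv_1_filters,
             kernel_size = FLAGS$conv_1_kernel,
             activation = "relu"
           ) %>%
           layer_max_pooling_1d(pool_size = FLAGS$conv_1_pooling) %>%
           layer_dropout(rate = FLAGS$conv_1_dropout) %>%
           layer_flatten() %>%
           layer_dense(units = 15, activation = "softmax")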
  13. training_run() using FLAGS

      tfruns::training_run(
        "walking_flags.R",
        flags = list(conv_1_filters = 32, conv_1_kernel = 8)
      )

      tfruns::training_run(
        "walking_flags.R",
        flags = list(conv_1_filters = 64, conv_1_kernel = 8)
      )
  14. Setting up a tuning run using flags

      tfruns::tuning_run(
        "walking_flags.R",
        sample = 128 / 1259712,
        flags = list(
          conv_1_filters  = c(16, 32, 64),
          conv_1_kernel   = c(8, 16, 32),
          conv_1_pooling  = c(2, 4),
          conv_1_dropout  = c(0.1, 0.2, 0.5),
          conv_2_filters  = c(16, 32, 64),
          conv_2_kernel   = c(8, 16, 32),
          conv_2_pooling  = c(2, 4),
          conv_2_dropout  = c(0.1, 0.2, 0.5),
          dense_1_nodes   = c(32, 64, 128, 256),
          dense_1_dropout = c(0.1, 0.2, 0.5),
          dense_2_nodes   = c(32, 64, 128, 256),
          dense_2_dropout = c(0.1, 0.2, 0.5)
        )
      )

      1.2M total combinations of flags, sampled down to 128 combinations.
  15. (screenshot)
  16. Extract the best performing models

      ls_runs(order = "eval_acc") %>% head()
         run_dir  eval_acc eval_loss metric_loss metric_acc metric_val_loss metric_val_acc
       * <chr>       <dbl>     <dbl>       <dbl>      <dbl>           <dbl>          <dbl>
       1 runs/20…    0.976    0.0857      0.0970      0.965          0.0633          0.981
       2 runs/20…    0.965    0.112       0.132       0.957          0.0902          0.972
       3 runs/20…    0.964    0.115       0.0829      0.973          0.101           0.969
       4 runs/20…    0.963    0.125       0.140       0.954          0.101           0.968
       5 runs/20…    0.960    0.139       0.0308      0.989          0.108           0.970
       6 runs/20…    0.960    0.127       0.0376      0.991          0.0859          0.972
       7 runs/20…    0.960    0.137       0.0444      0.984          0.114           0.969
       8 runs/20…    0.959    0.160       0.0573      0.985          0.130           0.958
       9 runs/20…    0.959    0.141       0.0621      0.979          0.0842          0.972
      10 runs/20…    0.958    0.116       0.141       0.951          0.0931          0.966
  17. cloudml
      • https://tensorflow.rstudio.com/tools/cloudml/
      • Scalable training of models built with the keras, tfestimators and
        tensorflow R packages
      • On-demand access to training on GPUs, including NVIDIA® Tesla P100 GPUs
      • Hyperparameter tuning to optimize key attributes of model architectures
        in order to maximize predictive accuracy
  18. cloudml install and configure

      devtools::install_github("rstudio/cloudml")
      cloudml::gcloud_install()
      cloudml::gcloud_init()

      Welcome! This command will take you through the configuration of gcloud.

      Settings from your current configuration [default] are:
      core:
        account: [email protected]
        disable_usage_reporting: 'True'
        project: tensorflow-cloud-demo

      Pick configuration to use:
       [1] Re-initialize this configuration [default] with new settings
       [2] Create a new configuration
      Please enter your numeric choice:
  19. cloudml::cloudml_train()
      • Automatically uploads the contents of the working directory along with
        the script
      • Automatically installs all required R packages on the Cloud ML servers

      library(cloudml)

      # Train on the default CPU instance
      cloudml_train("mnist_mlp.R")

      # Train on a GPU instance
      cloudml_train("walking_cloudml.R", master_type = "standard_gpu")

      # Train on an NVIDIA Tesla P100 GPU
      cloudml_train("walking_cloudml.R", master_type = "standard_p100")
  20. (screenshot)

  21. (screenshot)
  22. cloudml hyperparameter tuning

      cloudml_train("mnist_cnn_cloudml.R", config = "tuning.yml")

      Submitting training job to CloudML...
      Job 'cloudml_2018_04_25_084916940' successfully submitted.

      View job in the Cloud Console at:
      https://console.cloud.google.com/ml/jobs/cloudml_2018_04_25_084916940?project=tensorflow-cloud-demo

      View logs at:
      https://console.cloud.google.com/logs?resource=ml.googleapis.com%2Fjob_id%2Fcloudml_2018_04_25_084916940&project=tensorflow-cloud-demo

      Check job status with:     job_status("cloudml_2018_04_25_084916940")
      Collect job output with:   job_collect("cloudml_2018_04_25_084916940")
      After collect, view with:  view_run("runs/cloudml_2018_04_25_084916940")
  23. Inspect the tuning trial results (and cleaning up)

      library(dplyr)
      # Assign the cleaned trials so the next slide can plot them
      trials_clean <- job_trials() %>%
        as_tibble() %>%
        select(-one_of("hyperparameters.data_dir")) %>%
        mutate_at(vars(starts_with("hyperparameters")), round, 2) %>%
        rename_all(~sub("finalMetric.", "", .)) %>%
        rename_all(~sub("hyperparameters.", "", .)) %>%
        select(-c(trainingStep)) %>%
        arrange(desc(objectiveValue))

      # A tibble: 23 x 14
         objectiveValue conv_1_dropout conv_1_filters conv_1_kernel conv_1_pooling conv_2_dropout conv_2_filters
                  <dbl>          <dbl>          <dbl>         <dbl>          <dbl>          <dbl>          <dbl>
       1          0.971          0.100           128.           24.             2.          0.200           128.
       2          0.970          0.100           128.           24.             2.          0.              128.
       3          0.969          0.              128.           24.             2.          0.500           128.
       4          0.964          0.100           128.           24.             4.          0.200           128.
       5          0.964          0.200           128.           24.             4.          0.100           128.
       6          0.960          0.100           128.           24.             2.          0.              128.
       7          0.960          0.200           128.           24.             2.          0.100           128.
       8          0.958          0.               16.           24.             2.          0.200            64.
       9          0.957          0.               16.           32.             2.          0.100           128.
      10          0.957          0.200           128.           24.             4.          0.100           128.
  24. Inspect the tuning trial results

      library(ggplot2)
      trials_clean %>%
        ggplot(aes(x = trialId, y = objectiveValue)) +
        geom_point() +
        stat_smooth(method = "lm") +
        theme_bw(20)
  25. Summary
      • For some neural network training tasks, consider GPU machines
      • You have options for using GPUs in the cloud:
        • Paperspace / AWS machines
        • Google Cloud ML
      • You have tools to do this:
        • tfruns
        • cloudml