Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017

BigDL is a deep learning framework modeled after Torch and open-sourced by Intel in 2016. BigDL runs on Apache Spark, a fast, general, distributed computing platform that is widely used for Big Data processing and machine learning tasks.


Big Data Spain 2017
16th-17th November, Kinépolis Madrid


Big Data Spain

November 22, 2017


  2. Deep Learning in Spark with BigDL Petar Zečević petar.zecevic@svgroup.hr https://hr.linkedin.com/in/pzecevic

  3. Apache Spark Zagreb Meetup group http://www.meetup.com/Apache-Spark-Zagreb-Meetup

  4. http://sparkinaction.com Giving away 3 e-books! (Come to my talk tomorrow)

  5. 40% off all Manning books! Use the code: ctwbds17

  6. Agenda for today Briefly about Spark Briefly about

    Deep Learning Different options for DL on Spark Intel BigDL library Q & A
  7. Show of hands I've never used Apache Spark I've played

    around with it I'm planning to or I'm already using Spark in production
  8. Show of hands 2 I'm a beginner at deep learning I've

    built a few DL models I build DL models for a living
  9. Apache Spark A distributed data processing engine

  10. Why Spark? Spark is fast Simple and concise API Spark

    is a unifying platform Spark has gone mainstream
  11. Basic architecture

  12. Spark API components

  13. About deep learning Family of machine learning methods Inspired by

    the functioning of the nervous system Learning units are organized in layers Started in the 1960s Again popular due to algorithmic advances and the rise of computing resources Every month brings new advances (e.g. "capsule networks")
  14. Deep learning applications Computer vision Speech recognition Natural language processing

    Handwriting transcription Recommendation systems Better ad targeting Google Home, Amazon Alexa
  15. Types of neural networks Convolutional NNs (CNNs) Region-based CNNs (R-CNNs)

    Single Shot MultiBox Detectors (SSDs) Recurrent NNs (RNNs) Long short-term memory networks (LSTMs) Autoencoders Generative Adversarial Networks (GANs) Many other types
  16. General principle Adapted from Deep Learning with Python by F. Chollet

  17. A typical CNN (LeNet) Source: Wikipedia

  18. Convolutional layer Source: Wikipedia
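(The convolutional-layer slide is image-only in this transcript. As a plain-Scala sketch of the operation it illustrates — not the BigDL API — a "valid" 2D convolution slides a kernel over the input and sums elementwise products at each offset:)

```scala
object Conv {
  // "Valid" 2D cross-correlation (what most DL frameworks call convolution):
  // output size shrinks by (kernel size - 1) in each dimension.
  def conv2d(in: Array[Array[Double]], k: Array[Array[Double]]): Array[Array[Double]] = {
    val kh = k.length
    val kw = k(0).length
    Array.tabulate(in.length - kh + 1, in(0).length - kw + 1) { (r, c) =>
      // sum of elementwise products of the kernel and the input window at (r, c)
      (for (i <- 0 until kh; j <- 0 until kw) yield in(r + i)(c + j) * k(i)(j)).sum
    }
  }
}
```

(BigDL's `SpatialConvolution` applies this per input/output channel pair and adds a learned bias.)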

  19. Maxpooling layer Source: Wikipedia
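(The maxpooling slide is also image-only. A minimal plain-Scala sketch of the 2x2, stride-2 case used later in the example model — not the BigDL `SpatialMaxPooling` class — looks like this:)

```scala
object MaxPool {
  // 2x2 max pooling with stride 2: keep only the largest value in each
  // 2x2 window, halving both spatial dimensions (assumes even dimensions).
  def pool2x2(in: Array[Array[Double]]): Array[Array[Double]] =
    Array.tabulate(in.length / 2, in(0).length / 2) { (r, c) =>
      (for (dr <- 0 to 1; dc <- 0 to 1) yield in(2 * r + dr)(2 * c + dc)).max
    }
}
```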

  20. Fully connected / Dense / Linear layer Source: Wikipedia

  21. Sigmoid activation Source: Wikipedia

  22. ReLU activation Source: Wikipedia
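(The two activation slides show only plots. The functions themselves are one-liners; a plain-Scala sketch, independent of BigDL's `Sigmoid` and `ReLU` layers:)

```scala
object Activations {
  // Sigmoid squashes any real value into (0, 1) - used for binary outputs.
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // ReLU passes positive values through unchanged and zeroes out negatives.
  def relu(x: Double): Double = math.max(0.0, x)
}
```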

  23. Convolutional Network example

  24. AlexNet (Krizhevsky, Sutskever, Hinton)

  25. VGG (K. Simonyan and A. Zisserman)

  26. Inception

  27. Inception

  28. Deep learning on Apache Spark

  29. Available frameworks Intel BigDL TensorFlow on Spark Databricks Deep

    Learning Pipelines Caffe on Spark Elephas (Keras) MXNet mmlspark (CNTK) Eclipse Deeplearning4j SparkCL SparkNet ...
  30. About Intel BigDL Open-sourced in February 2017 Uses Intel MKL

    for fast computations Integrated into Spark No GPU execution Python and Scala APIs Load/save Caffe, TF, Torch models A wide variety of layers, optim methods, loss functions
  31. BigDL building blocks

  32. BigDL architecture

  33. Starting BigDL in local mode Add the BigDL jar to the

    classpath. Then...
    import com.intel.analytics.bigdl.utils.Engine
    System.setProperty("bigdl.localMode", "true")
    System.setProperty("bigdl.coreNumber", "8")
    Engine.init
  34. Starting BigDL on Spark Add the BigDL jar to the classpath

    (--jars)
    Set cmdline parameters (standalone and Mesos):
    spark-submit --master spark... --executor-cores --total-executor-cores
    Set cmdline parameters (YARN):
    spark-submit --master yarn --executor-cores --num-executors
    In your code...
    import com.intel.analytics.bigdl.utils.Engine
    val conf = Engine.createSparkConf()
    val sc = new SparkContext(conf)
    Engine.init
  35. Creating a model Sequential model:

    import com.intel.analytics.bigdl.nn._
    val model = Sequential[Float]()
    model.add(SpatialConvolution[Float](...))
    model.add(Tanh[Float]())
    model.add(SpatialMaxPooling[Float](...))
    model.add(Sigmoid[Float]())
    Graph model:
    val input = Input[Float]()
    val conv = SpatialConvolution[Float](...).inputs(input)
    val tanh = Tanh[Float]().inputs(conv)
    val maxp = SpatialMaxPooling[Float](...).inputs(tanh)
    val sigm = Sigmoid[Float]().inputs(maxp)
    val model = Graph(input, sigm)
  36. Example model output 73% dog, 27% cat 82% cat, 18% dog

  37. Example model

    val model = Sequential[Float]()
    model.add(SpatialConvolution[Float](3, 32, 3, 3, 1, 1, 1, 1).
      setInitMethod(Xavier, Xavier))
    model.add(ReLU(true))
    model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
    model.add(SpatialConvolution[Float](32, 64, 3, 3, 1, 1, 1, 1).
      setInitMethod(Xavier, Xavier))
    model.add(ReLU(true))
    model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
    model.add(SpatialConvolution[Float](64, 128, 3, 3, 1, 1, 1, 1).
      setInitMethod(Xavier, Xavier))
    model.add(ReLU(true))
    model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
  38. Example model - continued

    model.add(SpatialConvolution[Float](128, 128, 3, 3, 1, 1, 1, 1).
      setInitMethod(Xavier, Xavier))
    model.add(ReLU(true))
    model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
    model.add(View(128*7*7))
    model.add(Dropout(0.4))
    model.add(Linear[Float](inputSize=128*7*7, outputSize=512).
      setInitMethod(Xavier, Xavier))
    model.add(ReLU(true))
    model.add(Linear(inputSize=512, outputSize=1))
    model.add(Sigmoid())
  39. Preparing the data - official example But... load is

    a private method!
    val trainSet = DataSet.array(load(trainData, trainLabel)) ->
      SampleToGreyImg(28, 28) ->
      GreyImgNormalizer(trainMean, trainStd) ->
      GreyImgToBatch(batchSize)
  40. Preparing the data - transformers

  41. Preparing the data

    val bytes: RDD[Sample[Float]] = sc.binaryFiles(folder).
      map(pathbytes => {
        val buffImage = ImageIO.read(pathbytes._2.open())
        BGRImage.resizeImage(buffImage, SCALE_WIDTH, SCALE_HEIGHT)
      }).map(b =>
        new LabeledBGRImage().copy(b, 255f).setLabel(label)
      ).mapPartitions(iter =>
        new BGRImgToSample()(iter)
      )
  42. Create an optimizer

    val optimizer = Optimizer(module, trainRdd, BCECriterion[Float](), batchSize)
    optimizer.setEndWhen(Trigger.maxEpoch(10))
    optimizer.setOptimMethod(new Adam[Float](1e-4))
    optimizer.setValidation(Trigger.severalIteration(10), testRdd,
      Array(new Loss[Float](new BCECriterion[Float]), new Top1Accuracy[Float]), batchSize)
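    (As a plain-Scala sketch of what the BCECriterion above computes per prediction — not the BigDL class itself — binary cross-entropy penalizes confident wrong answers far more than uncertain ones:)

```scala
object Bce {
  // Binary cross-entropy for one prediction p in (0, 1)
  // against a target y in {0, 1}.
  def loss(p: Double, y: Double): Double =
    -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
}
```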
  43. Tensorboard visualisation setup

    val trainSumm = TrainSummary("/tensorboard/logdir", "train")
    val testSumm = ValidationSummary("/tensorboard/logdir", "test")
    optimizer.setTrainSummary(trainSumm)
    optimizer.setValidationSummary(testSumm)
    //start the optimization process:
    val trainedModule = optimizer.optimize()
  44. Optimization running

    [Epoch 2 18432/20000][Iteration 267][Wall Clock 888.091331139s] Trained
    144 records in 4.064710098 seconds. Throughput is 35.42688 records/second. Loss is 0.6683233.
    ========== Metrics Summary ==========
    get weights average : 0.2731059603333333 s
    computing time average : 0.742136533 s
    send weights average : 0.004483678833333333 s
    put gradient : 0.0018473921666666668 s
    aggregate gradient time : 0.004468877833333333 s
    aggregrateGradientParition average executor : 0.4345159966666667 s
    compute weight average : 0.006117520333333333 s
    get weights for each node : 0.03519228 0.03964764 0.027415565 0.040467617
    computing time for each node : 0.550181791 0.765139897 0.894009244 0.891691
    =====================================
    DEBUG DistriOptimizer$: Dropped modules: 0
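    (The throughput figure in the log above is simply records divided by the iteration's wall time:)

```scala
// Reproduce the throughput line from the log: 144 records in 4.064710098 s.
val records = 144.0
val seconds = 4.064710098
val throughput = records / seconds // matches the logged 35.42688 records/second
```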
  45. Optimization running [Wall Clock 857.896149222s] Validate model... Loss is (Loss:

    80.587006, count: 126, Average Loss: 0.6395794) Top1Accuracy is Accuracy(correct: 634, count: 1000, accuracy: 0.634)
  46. Tensorboard output - accuracy

  47. Tensorboard output - loss

  48. Tensorboard output - loss

  49. Data augmentation

    def readAugmentedSamples(folder:String, label:Float,
        scaleHeight:Int=96, scaleWidth:Int=128,
        includeOriginal:Boolean=true, flip:Boolean=false,
        minRotate:Int=0, maxRotate:Int=40, rotatedInstances:Int=0,
        minShear:Double=0, maxShear:Double=0.2, shearedInstances:Int=0,
        minZoom:Double=0, maxZoom:Double=0.2, zoomedInstances:Int=0,
        minTranslate:Int=0, maxTranslate:Int=0, translatedInstances:Int=0)
      : RDD[Array[Byte]] = { ... }
  50. Data augmentation

    var (resModule, resOptim) = runOptimizations(model, None,
      trainCats.union(trainDogs), testCats.union(testDogs), 24*6, 2, 1)
    var optimizedModule : Module[Float] = resModule
    var optimMethod : Option[OptimMethod[Float]] = Some(resOptim)
    for(c <- 1 to 20) {
      trainCats.unpersist()
      trainDogs.unpersist()
      trainCats = readSamplesFromHDFSImages(...)
      trainDogs = readSamplesFromHDFSImages(...)
      val (mod, optim) = runOptimizations(optimizedModule, optimMethod,
        trainCats.union(trainDogs), testCats.union(testDogs), 24*6, 2, 1)
      optimizedModule = mod
      optimMethod = Some(optim)
    }
  51. Tensorboard output - accuracy

  52. Tensorboard output - loss

  53. Using the model

    trainedModule.saveModule(path)
    val quantizedModel = trainedModule.quantize()
    val validPredicts = quantizedModel.predict(validationSet)
    validPredicts.filter(a => a.toTensor[Float].value > 0.5).count
    quantizedModel.evaluate(validationSet,
      Array(new Loss[Float](new BCECriterion[Float]), new Top1Accuracy[Float]), batchSize)
  54. Transfer learning - model freeze

  55. Spark ML integration

  56. Spark ML integration

    val dataSet = DataSet.rdd(byteRecordRdd) ->
      BytesToBGRImg(normalize=255f) ->
      BGRImgToBatch(batchSize, toRGB = false)
    val rdd = dataSet.asInstanceOf[DistributedDataSet[MiniBatch[Float]]].
      data(false).map(batch => {
        val feature = batch.getInput().asInstanceOf[Tensor[Float]]
        val labels = batch.getTarget().asInstanceOf[Tensor[Float]]
        (feature.storage().array(), labels.storage().array())
      })
    spark.createDataFrame(rdd).toDF("features", "labels")
  57. Spark ML integration

    val criterion = BCECriterion[Float]()
    val featureSize = Array(3, 100, 100)
    val estimator = new DLClassifier[Float](model, criterion, featureSize).
      setFeaturesCol("features").
      setLabelCol("labels").
      setBatchSize(24*6).
      setLearningRate(1e-4).
      setMaxEpoch(20).
      setOptimMethod(new Adam[Float](1e-4))
    val dlmodel:DLModel[Float] = estimator.fit(trainSet)
  58. Spark ML integration Can be used inside Spark ML Pipelines

    But... no access to the Optimizer, no validation, no visualization; not really useful yet
  59. BigDL Performance benchmarks (https://software.intel.com/en-us/mkl/features/benchmarks)

  60. Conclusion + Interesting and clever concept + Good engineering +

    Well optimized code + Lots of layers, optim methods etc. - Missing GPU support - Illogical package/class naming choices - Limited API debugging and data conversion options - Documentation could be better
  61. Giving away 3 e-books! Come to the lecture tomorrow

  62. 40% off all Manning books! Use the code: ctwbds17

  63. Questions ?
