Deep Learning in Spark with BigDL by Petar Zecevic at Big Data Spain 2017

BigDL is a deep learning framework modeled after Torch and open-sourced by Intel in 2016. BigDL runs on Apache Spark, a fast, general, distributed computing platform that is widely used for Big Data processing and machine learning tasks.

https://www.bigdataspain.org/2017/talk/deep-learning-in-spark-with-bigdl

Big Data Spain 2017
16th-17th November, Kinépolis Madrid

Big Data Spain

November 22, 2017
Transcript

  1. Agenda for today
     Briefly about Spark
     Briefly about Deep Learning
     Different options for DL on Spark
     Intel BigDL library
     Q & A
  2. Show of hands
     I've never used Apache Spark
     I've played around with it
     I'm planning to, or I'm already using Spark in production
  3. Show of hands 2
     I'm a beginner at deep learning
     I've built a few DL models
     I build DL models for a living
  4. Why Spark?
     Spark is fast
     Simple and concise API
     Spark is a unifying platform
     Spark has gone mainstream
  5. About deep learning
     Family of machine learning methods
     Inspired by the functioning of the nervous system
     Learning units are organized in layers
     Started in the 1960s
     Popular again due to algorithmic advances and the rise of computing resources
     Every month brings new advances (e.g. "capsule networks")
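The "learning units organized in layers" idea can be sketched in a few lines of plain Scala (no BigDL): each layer is a weight matrix plus an activation, and a forward pass just chains the layers. All numbers and names here are illustrative, not from the talk.

```scala
object TinyNet {
  type Vec = Array[Double]

  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // One fully connected layer: out(j) = act(sum_i w(j)(i) * in(i) + b(j))
  def layer(weights: Array[Vec], bias: Vec, act: Double => Double)(in: Vec): Vec =
    weights.zip(bias).map { case (w, b) =>
      act(w.zip(in).map { case (wi, xi) => wi * xi }.sum + b)
    }

  def main(args: Array[String]): Unit = {
    // Two stacked layers: 2 inputs -> 3 hidden units -> 1 output
    val hidden = layer(Array(Array(0.5, -0.3), Array(0.1, 0.8), Array(-0.6, 0.2)),
                       Array(0.0, 0.1, -0.1), sigmoid) _
    val output = layer(Array(Array(0.7, -0.2, 0.4)), Array(0.05), sigmoid) _
    val y = output(hidden(Array(1.0, 2.0)))
    println(y(0)) // a single probability-like value in (0, 1)
  }
}
```

A "deep" network is nothing more than many such layers composed; training consists of adjusting the weights and biases from data.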
  6. Deep learning applications
     Computer vision
     Speech recognition
     Natural language processing
     Handwriting transcription
     Recommendation systems
     Better ad targeting
     Google Home, Amazon Alexa
  7. Types of neural networks
     Convolutional NNs (CNNs)
     Region-based CNNs (R-CNNs)
     Single Shot MultiBox Detectors (SSDs)
     Recurrent NNs (RNNs)
     Long short-term memory networks (LSTMs)
     Autoencoders
     Generative Adversarial Networks (GANs)
     Many other types
  8. Available frameworks
     Intel BigDL
     TensorFlow on Spark
     Databricks Deep Learning Pipelines
     Caffe on Spark
     Elephas (Keras)
     MXNet
     mmlspark (CNTK)
     Eclipse Deeplearning4j
     SparkCL
     SparkNet
     ...
  9. About Intel BigDL
     Open-sourced in February 2017
     Uses Intel MKL for fast computations
     Integrated into Spark
     No GPU execution
     Python and Scala APIs
     Load/save Caffe, TF, Torch models
     A wide variety of layers, optim methods, loss functions
  10. Starting BigDL in local mode
     Add the BigDL jar to the classpath. Then...
       import com.intel.analytics.bigdl.utils.Engine
       System.setProperty("bigdl.localMode", "true")
       System.setProperty("bigdl.coreNumber", "8")
       Engine.init
  11. Starting BigDL on Spark
     Add the BigDL jar to the classpath (--jars)
     Set cmdline parameters (standalone and Mesos):
       spark-submit --master spark... --executor-cores --total-executor-cores
     Set cmdline parameters (YARN):
       spark-submit --master yarn --executor-cores --num-executors
     In your code...
       import com.intel.analytics.bigdl.utils.Engine
       val conf = Engine.createSparkConf()
       val sc = new SparkContext(conf)
       Engine.init
  12. Creating a model
     Sequential model:
       import com.intel.analytics.bigdl.nn._
       val model = Sequential[Float]()
       model.add(SpatialConvolution[Float](...))
       model.add(Tanh[Float]())
       model.add(SpatialMaxPooling[Float](...))
       model.add(Sigmoid[Float]())
     Graph model:
       val input = Input[Float]()
       val conv = SpatialConvolution[Float](...).inputs(input)
       val tanh = Tanh[Float]().inputs(conv)
       val maxp = SpatialMaxPooling[Float](...).inputs(tanh)
       val sigm = Sigmoid[Float]().inputs(maxp)
       val model = Graph(input, sigm)
  13. Example model
     val model = Sequential[Float]()
     model.add(SpatialConvolution[Float](3, 32, 3, 3, 1, 1, 1, 1).
       setInitMethod(Xavier, Xavier))
     model.add(ReLU(true))
     model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
     model.add(SpatialConvolution[Float](32, 64, 3, 3, 1, 1, 1, 1).
       setInitMethod(Xavier, Xavier))
     model.add(ReLU(true))
     model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
     model.add(SpatialConvolution[Float](64, 128, 3, 3, 1, 1, 1, 1).
       setInitMethod(Xavier, Xavier))
     model.add(ReLU(true))
     model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
  14. Example model - continued
     model.add(SpatialConvolution[Float](128, 128, 3, 3, 1, 1, 1, 1).
       setInitMethod(Xavier, Xavier))
     model.add(ReLU(true))
     model.add(SpatialMaxPooling[Float](kW=2, kH=2, dW=2, dH=2).ceil())
     model.add(View(128*7*7))
     model.add(Dropout(0.4))
     model.add(Linear[Float](inputSize=128*7*7, outputSize=512).
       setInitMethod(Xavier, Xavier))
     model.add(ReLU(true))
     model.add(Linear(inputSize=512, outputSize=1))
     model.add(Sigmoid())
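Where does View(128*7*7) come from? Each 3x3 convolution in this model uses stride 1 and padding 1, so it preserves spatial size, while each 2x2 max-pooling with stride 2 in ceil mode halves it, rounding up. Assuming 112x112 input images (the input size is not stated on the slides, so this is an inference from the View size), the four pooling layers give 7x7 feature maps with 128 channels. A quick check in plain Scala:

```scala
object ShapeCheck {
  // Output size of one 2x2, stride-2 pooling in ceil mode
  def pool(n: Int): Int = math.ceil(n / 2.0).toInt

  def main(args: Array[String]): Unit = {
    // 112 -> 56 -> 28 -> 14 -> 7 after four pooling layers
    val afterPools = Iterator.iterate(112)(pool).drop(4).next()
    println(afterPools)                    // 7
    println(128 * afterPools * afterPools) // 6272 = 128*7*7, the View size
  }
}
```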
  15. Preparing the data - official example
     val trainSet = DataSet.array(load(trainData, trainLabel)) ->
       SampleToGreyImg(28, 28) ->
       GreyImgNormalizer(trainMean, trainStd) ->
       GreyImgToBatch(batchSize)
     But... load is a private method!
  16. Preparing the data
     val bytes: RDD[Sample[Float]] = sc.binaryFiles(folder).
       map(pathbytes => {
         val buffImage = ImageIO.read(pathbytes._2.open())
         BGRImage.resizeImage(buffImage, SCALE_WIDTH, SCALE_HEIGHT)
       }).map(b =>
         new LabeledBGRImage().copy(b, 255f).setLabel(label)
       ).mapPartitions(iter =>
         new BGRImgToSample()(iter)
       )
  17. Create an optimizer
     val optimizer = Optimizer(module, trainRdd, BCECriterion[Float](), batchSize)
     optimizer.setEndWhen(Trigger.maxEpoch(10))
     optimizer.setOptimMethod(new Adam[Float](1e-4))
     optimizer.setValidation(Trigger.severalIteration(10), testRdd,
       Array(new Loss[Float](new BCECriterion[Float]), new Top1Accuracy[Float]),
       batchSize)
  18. Tensorboard visualisation setup
     val trainSumm = TrainSummary("/tensorboard/logdir", "train")
     val testSumm = ValidationSummary("/tensorboard/logdir", "test")
     optimizer.setTrainSummary(trainSumm)
     optimizer.setValidationSummary(testSumm)
     // start the optimization process:
     val trainedModule = optimizer.optimize()
  19. Optimization running
     [Epoch 2 18432/20000][Iteration 267][Wall Clock 888.091331139s]
     Trained 144 records in 4.064710098 seconds. Throughput is 35.42688 records/second. Loss is 0.6683233.
     ========== Metrics Summary ==========
     get weights average : 0.2731059603333333 s
     computing time average : 0.742136533 s
     send weights average : 0.004483678833333333 s
     put gradient : 0.0018473921666666668 s
     aggregate gradient time : 0.004468877833333333 s
     aggregrateGradientParition average executor : 0.4345159966666667 s
     compute weight average : 0.006117520333333333 s
     get weights for each node : 0.03519228 0.03964764 0.027415565 0.040467617
     computing time for each node : 0.550181791 0.765139897 0.894009244 0.891691
     =====================================
     DEBUG DistriOptimizer$: Dropped modules: 0
  20. Optimization running
     [Wall Clock 857.896149222s] Validate model...
     Loss is (Loss: 80.587006, count: 126, Average Loss: 0.6395794)
     Top1Accuracy is Accuracy(correct: 634, count: 1000, accuracy: 0.634)
  21. Data augmentation
     def readAugmentedSamples(folder: String, label: Float,
       scaleHeight: Int = 96, scaleWidth: Int = 128,
       includeOriginal: Boolean = true, flip: Boolean = false,
       minRotate: Int = 0, maxRotate: Int = 40, rotatedInstances: Int = 0,
       minShear: Double = 0, maxShear: Double = 0.2, shearedInstances: Int = 0,
       minZoom: Double = 0, maxZoom: Double = 0.2, zoomedInstances: Int = 0,
       minTranslate: Int = 0, maxTranslate: Int = 0,
       translatedInstances: Int = 0): RDD[Array[Byte]] = { ... }
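As a hedged sketch of one augmentation from the signature above, here is a horizontal flip of an image stored as a row-major pixel array, in plain Scala (no Spark/BigDL). The helper name `flipHorizontal` and the 3x2 image are my own illustration, not from the talk.

```scala
object Augment {
  // Reverse each row of a width x height image stored row-major
  def flipHorizontal(pixels: Array[Int], width: Int): Array[Int] =
    pixels.grouped(width).flatMap(_.reverse).toArray

  def main(args: Array[String]): Unit = {
    val img = Array(1, 2, 3,
                    4, 5, 6) // a 3x2 "image"
    println(flipHorizontal(img, 3).mkString(",")) // 3,2,1,6,5,4
  }
}
```

Rotation, shear, zoom, and translation follow the same pattern: a pure pixel transform applied per image before the RDD is handed to the optimizer, multiplying the effective training set size.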
  22. Data augmentation
     var (resModule, resOptim) = runOptimizations(model, None,
       trainCats.union(trainDogs), testCats.union(testDogs), 24*6, 2, 1)
     var optimizedModule: Module[Float] = resModule
     var optimMethod: Option[OptimMethod[Float]] = Some(resOptim)
     for (c <- 1 to 20) {
       trainCats.unpersist()
       trainDogs.unpersist()
       trainCats = readSamplesFromHDFSImages(...)
       trainDogs = readSamplesFromHDFSImages(...)
       val (mod, optim) = runOptimizations(optimizedModule, optimMethod,
         trainCats.union(trainDogs), testCats.union(testDogs), 24*6, 2, 1)
       optimizedModule = mod
       optimMethod = Some(optim)
     }
  23. Using the model
     trainedModule.saveModule(path)
     val quantizedModel = trainedModule.quantize()
     val validPredicts = quantizedModel.predict(validationSet)
     validPredicts.filter(a => a.toTensor[Float].value > 0.5).count
     quantizedModel.evaluate(validationSet,
       Array(new Loss[Float](new BCECriterion[Float]), new Top1Accuracy[Float]),
       batchSize)
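The filter above applies a 0.5 decision threshold to the sigmoid output of the binary (cats-vs-dogs) model. The same logic in plain Scala, with made-up scores for illustration:

```scala
object Threshold {
  // Count predictions classified as the positive class
  def positives(scores: Seq[Float], threshold: Float = 0.5f): Int =
    scores.count(_ > threshold)

  def main(args: Array[String]): Unit = {
    val scores = Seq(0.91f, 0.40f, 0.73f, 0.05f)
    println(positives(scores)) // 2
  }
}
```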
  24. Spark ML integration
     val dataSet = DataSet.rdd(byteRecordRdd) ->
       BytesToBGRImg(normalize=255f) ->
       BGRImgToBatch(batchSize, toRGB = false)
     val rdd = dataSet.asInstanceOf[DistributedDataSet[MiniBatch[Float]]].
       data(false).map(batch => {
         val feature = batch.getInput().asInstanceOf[Tensor[Float]]
         val labels = batch.getTarget().asInstanceOf[Tensor[Float]]
         (feature.storage().array(), labels.storage().array())
       })
     spark.createDataFrame(rdd).toDF("features", "labels")
  25. Spark ML integration
     val criterion = BCECriterion[Float]()
     val featureSize = Array(3, 100, 100)
     val estimator = new DLClassifier[Float](model, criterion, featureSize).
       setFeaturesCol("features").
       setLabelCol("labels").
       setBatchSize(24*6).
       setLearningRate(1e-4).
       setMaxEpoch(20).
       setOptimMethod(new Adam[Float](1e-4))
     val dlmodel: DLModel[Float] = estimator.fit(trainSet)
  26. Spark ML integration
     Can be used inside Spark ML Pipelines
     But...
       no access to the Optimizer
       no validation
       no visualization
       not really useful yet
  27. Conclusion
     + Interesting and clever concept
     + Good engineering
     + Well-optimized code
     + Lots of layers, optim methods, etc.
     - Missing GPU support
     - Illogical package/class naming choices
     - API debugging and data conversion options
     - Documentation could be better