Event Driven Machine Learning

Loïc DIVAD
November 28, 2019

Transcript

  2. G.Lo App: given the current location and the destination, estimate the trip duration.
  3. Batch inference: (1) start from historical data about taxi trips, (2) train a model on it to obtain a trained model, (3) use the trained model to make batch predictions.
  4. Continuous inference: the trained model is used to make one prediction at a time, for each incoming event.
  5.–7. The rise of event stream applications: a Centralized Event Log.
  8. What if your model was an event stream app? A Kafka Streams application embedding the TensorFlow model, reading from and writing to Kafka topics.
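    To make the idea concrete, here is a minimal sketch of such an app using the kafka-streams-scala DSL. The topic names ("taxi-trips", "trip-predictions") and the score stub standing in for the real TensorFlow session call are assumptions for illustration, not names from the project:

    import java.util.Properties
    import org.apache.kafka.streams.scala.ImplicitConversions._
    import org.apache.kafka.streams.scala.Serdes._
    import org.apache.kafka.streams.scala.StreamsBuilder
    import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

    object ScoringApp extends App {
      val props = new Properties()
      props.put(StreamsConfig.APPLICATION_ID_CONFIG, "edml-scoring")
      props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

      // Stub standing in for session.runner().feed(...).fetch(...).run()
      def score(trip: String): String = s"predicted-duration-for: $trip"

      val builder = new StreamsBuilder()
      builder
        .stream[String, String]("taxi-trips")  // new records, one per event
        .mapValues(score)                      // one prediction per record
        .to("trip-predictions")                // prediction stream

      new KafkaStreams(builder.build(), props).start()
    }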
  9. Project structure
    .
    ├── build.gradle
    ├── edml-schema
    │   ├── build.gradle
    │   └── src
    ├── edml-scoring
    │   ├── build.gradle
    │   └── src
    ├── edml-serving
    │   ├── build
    │   ├── build.gradle
    │   └── src
    ├── edml-trainer
    │   ├── build.gradle
    │   └── setup.py
    └── terraform
        ├── ...
        └── ...
  10. Working environment: AI Platform, BigQuery, GitLab CI, Kafka Streams apps on GKE, Kafka Connect instances on GCE, KSQL servers on GCE, Kafka as a Service, and Control Center.
  11. Available data: pick-up location, pick-up datetime, drop-off location, drop-off datetime, trip duration, passenger count, trip distance, total amount, tips.
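    As a sketch, a record carrying these fields could be modelled as follows; the type and field names are assumptions for illustration, not the project's schema (trip duration is the label the model learns to predict):

    import java.time.Instant

    // Hypothetical record type mirroring the fields listed above
    final case class TaxiTrip(
      pickupLocation: String,
      pickupDatetime: Instant,
      dropoffLocation: String,
      dropoffDatetime: Instant,
      tripDuration: Double, // label to predict
      passengerCount: Int,
      tripDistance: Double,
      totalAmount: Double,
      tips: Double
    )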
  12. Wide features: day of week, hour of day, pick-up zone, drop-off zone.
  13. Deep features: day of year, hour of day, pick-up zone, drop-off zone, passenger count.
  14. Code organisation to run in GCP
    $ edml-trainer/
    .
    ├── setup.py
    └── trainer
        ├── __init__.py
        ├── model.py
        └── task.py
  15. Code organisation to run in GCP
    # task.py [page 1]
    import argparse

    from . import model

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        # Input arguments for ai-platform
        parser.add_argument(
            '--bucket',
            help='GCS path to project bucket',
            required=True
        )
        ....
        # Input arguments for modeling
        parser.add_argument(
            '--batch-size',
            type=int,
            default=512
        )

    # task.py [page 2]
        parser.add_argument(
            '--output-dir',
            help='GCS location to write checkpoints and export models',
            required=True
        )
        ....
        # assign arguments to model variables
        output_dir = arguments.pop('output_dir')
        model.BUCKET = arguments.pop('bucket')
        model.BATCH_SIZE = arguments.pop('batch_size')
        ....
        # Run the training job
        model.train_and_evaluate(output_dir)
  16. Code organisation to run in GCP
    # model.py [page 1]
    import tensorflow as tf

    BATCH_SIZE = 512
    ...
    CSV_COLUMNS = [...]
    LABEL_COLUMN = "trip_duration"
    KEY_COLUMN = "uuid"

    def read_dataset(...):
        def _input_fn():
            ...
        return _input_fn()

    # Feature engineering
    def get_wide_deep():
        ...
        return wide, deep

    # model.py [page 2]
    # Serving input receiver function
    def serving_input_receiver_fn():
        receiver_tensors = { ... }
        return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

    # Model training and evaluation
    def train_and_evaluate(output_dir):
        ...
        estimator = tf.estimator.DNNLinearCombinedRegressor(...)
        ...
        tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  17. Code organisation to run in GCP
    #!/usr/bin/env bash
    BUCKET=edml
    TRAINER_PACKAGE_PATH=gs://$BUCKET/data/taxi-trips/sources
    MAIN_TRAINER_MODULE="trainer.task"
    ...
    OUTDIR=gs://$BUCKET/ai-platform/models/$VERSION

    gcloud ai-platform jobs submit training $JOB_NAME \
        --job-dir $JOB_DIR \
        --package-path $TRAINER_PACKAGE_PATH \
        --module-name $MAIN_TRAINER_MODULE \
        --region $REGION \
        -- \
        --batch-size=$BATCH_SIZE \
        --output-dir=$OUTDIR \
        --pattern="*" \
        --train-examples=174000 \
        --eval-steps=1
  18. Streaming apps deployment: Kafka Streams app pods, scheduled by the Kube master.
    // build.gradle
    compile group: 'org.tensorflow', name: 'proto', version: '1.15.0'
    compile group: 'org.tensorflow', name: 'tensorflow', version: '1.15.0'

    // Processor.scala
    import org.tensorflow._
    import org.tensorflow.framework.GraphDef

    val graphDef: GraphDef = GraphDef.parseFrom(Array.empty[Byte])
    val graph = new Graph()
    graph.importGraphDef(graphDef.toByteArray)
    val session = new Session(graph)
  19. The SavedModel format from TF: loading the graph alone fails at runtime with "Not found: Resource … variable was uninitialized"; the variables need a serde of their own.
    $ tree my_model/
    .
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00002
        ├── variables.data-00001-of-00002
        └── variables.index
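    One way to avoid the uninitialized-variable error is to load the whole SavedModel directory (graph plus variables) through the Java API's SavedModelBundle; a minimal sketch, assuming the model was exported with the standard "serve" tag and using a placeholder path:

    import org.tensorflow.SavedModelBundle

    // Loads saved_model.pb together with the variables/ directory,
    // restoring the variable values into the returned session
    val bundle  = SavedModelBundle.load("/path/to/my_model", "serve")
    val graph   = bundle.graph()
    val session = bundle.session()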
  20. A model producer… for automation!
    // ModelPublisher.scala
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    val topic: String = "<model.topic>"
    val version: String = "<model.version>"
    val model: String = "gs://.../<model.version>"
    //…
    val producer = new KafkaProducer[ModelKey, TFSavedModel](...
    val key = ModelKey("<app.name>")
    val value = TFSavedModel(…
    //…
    producer.send(new ProducerRecord(topic, key, value))
    producer.flush()
  21. Two input streams: the CI deploy stage publishes trained models to a model topic (a compacted topic), while the app consumes new records and produces predictions.
  23. Inside the app: a Data Source and a Model Source feed the Stream Processor; the current model sits in Model Storage, a RocksDB key-value store, and the processor emits the Prediction Stream.
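    A minimal sketch of such a processor with the Kafka Streams Processor API, reusing the hypothetical TaxiTrip type from above; the store name, the serialized-model value type, and the score stub are assumptions, not the project's actual code:

    import org.apache.kafka.streams.processor.{Processor, ProcessorContext}
    import org.apache.kafka.streams.state.KeyValueStore

    // Scores each incoming record with the model currently held in the
    // RocksDB-backed state store, which is updated whenever a new model
    // arrives on the model topic
    class ScoringProcessor extends Processor[String, TaxiTrip] {

      private var context: ProcessorContext = _
      private var modelStore: KeyValueStore[String, Array[Byte]] = _

      override def init(context: ProcessorContext): Unit = {
        this.context = context
        this.modelStore = context
          .getStateStore("model-storage")
          .asInstanceOf[KeyValueStore[String, Array[Byte]]]
      }

      override def process(key: String, trip: TaxiTrip): Unit = {
        val currentModel = modelStore.get("current") // latest deployed model
        val prediction = score(currentModel, trip)   // stand-in for the TF session call
        context.forward(key, prediction)             // emit to the prediction stream
      }

      // Stub: deserialize the model and run the TensorFlow session
      private def score(model: Array[Byte], trip: TaxiTrip): Double = ???

      override def close(): Unit = ()
    }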
  24. Continuous integration: TEST ► PACKAGE ► TRAIN ► DEPLOY, covering both the model and the Kafka Streams app. Build artefacts are versioned 0.1.0-<dt>-<sha1>, trained models 0.1.0-<dt>-<sha1>-<N>, and each deployment ships with its metadata ({"metadata":"..."}).
  27. Conclusion: how do you face a drop in performance over time?
  28. Pictures: photos by Daniel Jensen, Dimon Blr, Lerone Pieters, Miryam León, Matthew Hamilton, Luke Stackpoole, Gustavo, Gerrie van der Walt, Eepeng Cheong, Rock'n Roll Monkey, chuttersnap, Denys Nevozhai, and Mike Tsitas on Unsplash; photo by Negative Space from Pexels.