Slide 1

Slide 1 text

A real-time ML pipeline proposal

Slide 2

Slide 2 text


Slide 3

Slide 3 text

Giulia Bianchi, Data Scientist @Xebia France, @Giuliabianchl
Loïc Divad, Software Engineer @Xebia France, @Loicmdivad

Slide 4

Slide 4 text

G.Lo Taxi & Co.

Slide 5

Slide 5 text

G.Lo Taxi & Co.

Slide 6

Slide 6 text

Giulia Bianchi, Data Lover & Community Contributor

Slide 7

Slide 7 text

G.Lo App: given the current location and destination, estimate the trip duration

Slide 8

Slide 8 text

Data Science 101

Slide 9

Slide 9 text

Batch inference
1. Historical data about taxi trips
2. Train a model to obtain a trained model
3. Use the trained model to make batch predictions

Slide 10

Slide 10 text

Trip duration estimation at G.Lo Taxi & Co.

Slide 11

Slide 11 text

Continuous inference
3. Use the trained model to make one prediction per incoming event, repeated for every new event

Slide 12

Slide 12 text

Streaming is the new batch

Slide 13

Slide 13 text

> println(sommaire)

Slide 14

Slide 14 text

ML powered by Event Stream Apps

Slide 15

Slide 15 text

Loïc Divad, Apache Kafka Lover

Slide 16

Slide 16 text

The rise of event stream applications: a Centralized Event Log

Slide 17

Slide 17 text

The rise of event stream applications: a Centralized Event Log

Slide 18

Slide 18 text

The rise of event stream applications: a Centralized Event Log

Slide 19

Slide 19 text

DISCLAIMER

Slide 20

Slide 20 text

The Challenge: Constraints

Slide 21

Slide 21 text

What if your model was an event stream app? A Kafka Streams application wiring a TensorFlow MODEL to Kafka TOPICS.
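
A minimal sketch of that idea, assuming a PICKUPS input topic, a PREDICTIONS output topic, CSV-encoded values and a stubbed predict function (all hypothetical names; the actual TensorFlow call is sketched under Slide 38):

// ScoringApp.scala: every pickup event flows through the model exactly once
import java.util.Properties

import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

object ScoringApp extends App {

  val props = new Properties()
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "edml-scoring")
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

  // stand-in for the embedded TensorFlow model
  def predict(features: Array[Float]): Float = features.sum

  val builder = new StreamsBuilder()
  builder
    .stream[String, String]("PICKUPS")                        // hypothetical input topic
    .mapValues(csv => predict(csv.split(',').map(_.toFloat))) // one prediction per event
    .mapValues(_.toString)
    .to("PREDICTIONS")                                        // hypothetical output topic

  new KafkaStreams(builder.build(), props).start()
}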

Slide 22

Slide 22 text

Working Environment

Slide 23

Slide 23 text

Project structure

.
├── build.gradle
├── edml-schema
│   ├── build.gradle
│   └── src
├── edml-scoring
│   ├── build.gradle
│   └── src
├── edml-serving
│   ├── build
│   ├── build.gradle
│   └── src
├── edml-trainer
│   ├── build.gradle
│   └── setup.py
└── terraform
    ├── ...
    └── ...

Slide 24

Slide 24 text

Working Environment: Kafka Streams apps on GKE, Kafka as a Service

Slide 25

Slide 25 text

Replay, an integration data stream: PICKUPS-2018-11-28 → PICKUPS-REPLAY
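
One possible way to build such a replay stream, sketched with the plain Kafka clients (topic names from the slide; the group id is an assumption, and throttling to the original event rate is left out):

// Replay.scala: copy a frozen day of pickups into a live-looking topic
import java.time.Duration
import java.util.Properties

import scala.jdk.CollectionConverters._

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, ByteArraySerializer}

object Replay extends App {

  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "edml-replay")
  props.put("auto.offset.reset", "earliest")

  val consumer =
    new KafkaConsumer(props, new ByteArrayDeserializer, new ByteArrayDeserializer)
  val producer =
    new KafkaProducer(props, new ByteArraySerializer, new ByteArraySerializer)

  consumer.subscribe(List("PICKUPS-2018-11-28").asJava)

  while (true) {
    for (rec <- consumer.poll(Duration.ofMillis(500)).asScala) {
      // re-stamp each record with the current time so it looks like a live event
      val replayed = new ProducerRecord("PICKUPS-REPLAY", null,
        java.lang.Long.valueOf(System.currentTimeMillis()), rec.key(), rec.value())
      producer.send(replayed)
    }
  }
}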

Slide 26

Slide 26 text

Working Environment: AI Platform, BigQuery, GitLab CI, Kafka Streams apps on GKE, Kafka Connect instances on GCE, KSQL servers on GCE, Kafka as a Service, Control Center

Slide 27

Slide 27 text

The model

Slide 28

Slide 28 text

Available data: Pick-up Location, Pick-up Datetime, Drop-off Location, Drop-off Datetime, Trip Duration, Passenger Count, Trip Distance, Total Amount, Tips

Slide 29

Slide 29 text

New York City zones

Slide 30

Slide 30 text

Wide features: day of week, hour of day, pick-up zone, drop-off zone

Slide 31

Slide 31 text

Deep features: day of year, hour of day, pick-up zone, drop-off zone, passenger count

Slide 32

Slide 32 text

Wide and Deep Learning: a linear model memorizes over the wide (sparse, crossed) features while a DNN generalizes over the deep features, and both parts are trained jointly

Slide 33

Slide 33 text

Code organisation to run in GCP

edml-trainer/
.
├── setup.py
└── trainer
    ├── __init__.py
    ├── model.py
    └── task.py

Slide 34

Slide 34 text

Code organisation to run in GCP

# task.py [page 1]
import argparse

from . import model

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    # Input arguments for ai-platform
    parser.add_argument(
        '--bucket',
        help='GCS path to project bucket',
        required=True
    )
    ....
    # Input arguments for modeling
    parser.add_argument(
        '--batch-size',
        type=int,
        default=512
    )

# task.py [page 2]
    parser.add_argument(
        '--output-dir',
        help='GCS location to write checkpoints and export models',
        required=True
    )
    ....
    # parse CLI args into a dict (this step is elided on the original slide)
    arguments = parser.parse_args().__dict__
    # assign arguments to model variables
    output_dir = arguments.pop('output_dir')
    model.BUCKET = arguments.pop('bucket')
    model.BATCH_SIZE = arguments.pop('batch_size')
    ....
    # Run the training job
    model.train_and_evaluate(output_dir)

Slide 35

Slide 35 text

Code organisation to run in GCP

# model.py [page 1]
import tensorflow as tf

BATCH_SIZE = 512
...
CSV_COLUMNS = [...]
LABEL_COLUMN = "trip_duration"
KEY_COLUMN = "uuid"

def read_dataset(...):
    def _input_fn():
        ...
    return _input_fn()

# Feature engineering
def get_wide_deep():
    ...
    return wide, deep

# model.py [page 2]
# Serving input receiver function
def serving_input_receiver_fn():
    receiver_tensors = {
        ...
    }
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

# Model training and evaluation
def train_and_evaluate(output_dir):
    ...
    estimator = tf.estimator.DNNLinearCombinedRegressor(...)
    ...
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Slide 36

Slide 36 text

Code organisation to run in GCP: gcloud

#!/usr/bin/env bash
BUCKET=edml
TRAINER_PACKAGE_PATH=gs://$BUCKET/data/taxi-trips/sources
MAIN_TRAINER_MODULE="trainer.task"
...
OUTDIR=gs://$BUCKET/ai-platform/models/$VERSION

gcloud ai-platform jobs submit training $JOB_NAME \
    --job-dir $JOB_DIR \
    --package-path $TRAINER_PACKAGE_PATH \
    --module-name $MAIN_TRAINER_MODULE \
    --region $REGION \
    -- \
    --batch-size=$BATCH_SIZE \
    --output-dir=$OUTDIR \
    --pattern="*" \
    --train-examples=174000 \
    --eval-steps=1

Slide 37

Slide 37 text

Development Workflow

Slide 38

Slide 38 text

Streaming apps deployment: Kafka Streams apps run as pods, scheduled by the Kube master

// build.gradle
compile group: 'org.tensorflow', name: 'proto', version: '1.15.0'
compile group: 'org.tensorflow', name: 'tensorflow', version: '1.15.0'

// Processor.scala
import org.tensorflow._
import org.tensorflow.framework.GraphDef // from the 'proto' artifact

// placeholder bytes: the real GraphDef comes in over the model topic
val graphDef: GraphDef = GraphDef.parseFrom(Array.empty[Byte])
val graph = new Graph()
graph.importGraphDef(graphDef.toByteArray)
val session = new Session(graph)
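
Once the graph and session exist, scoring boils down to one runner() call. A sketch, assuming the exported graph names its operations "input" and "output" (both placeholders for whatever the real export defines):

// Scoring.scala: feed features, fetch the prediction
import org.tensorflow.{Session, Tensor}

def score(session: Session, features: Array[Float]): Float = {
  val input = Tensor.create(Array(features)) // a [1, n] float tensor
  try {
    val output = session.runner()
      .feed("input", input)  // placeholder op name
      .fetch("output")       // placeholder op name
      .run()
      .get(0)
    try {
      val result = Array.ofDim[Float](1, 1)
      output.copyTo(result)
      result(0)(0)
    } finally output.close()
  } finally input.close()
}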

Slide 39

Slide 39 text

The SavedModel Format from TF: importing the Graph alone fails at scoring time with "Not found: Resource … variable was uninitialized"; the variables must travel with the graph, so the Serde ships both

$ tree my_model/
.
├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index
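
That is exactly what SavedModelBundle does in the Java bindings: a minimal sketch, where the path and the "serve" tag mirror the tree above:

import org.tensorflow.SavedModelBundle

// restores the graph AND its variables in one call,
// avoiding the "variable was uninitialized" error
val bundle = SavedModelBundle.load("my_model", "serve")
val session = bundle.session() // ready for session.runner() calls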

Slide 40

Slide 40 text

A model producer… for automation!

// ModelPublisher.scala
val topic: String = ""
val version: String = ""
val model: String = "gs://.../"
//…
val producer = new KafkaProducer[ModelKey, TFSavedModel](...
val key = ModelKey("")
val value = TFSavedModel(…
//…
producer.send(new ProducerRecord(topic, key, value))
producer.flush()

Slide 41

Slide 41 text

2 input streams: the app's CI deploy stage publishes every new model to the MODEL TOPIC (log-compacted), while NEW RECORDS stream in and PREDICTIONS stream out
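
The compacted model topic can be created ahead of time with the admin client; a sketch, where the topic name, partition count and replication factor are assumptions:

// CreateModelTopic.scala
import java.util.{Collections, Properties}

import scala.jdk.CollectionConverters._

import org.apache.kafka.clients.admin.{AdminClient, NewTopic}
import org.apache.kafka.common.config.TopicConfig

object CreateModelTopic extends App {

  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  val admin = AdminClient.create(props)

  // compaction keeps only the latest model per key: the topic stays small
  // and every restarting consumer still receives the current model
  val modelTopic = new NewTopic("MODELS", 1, 1.toShort)
    .configs(Map(TopicConfig.CLEANUP_POLICY_CONFIG -> TopicConfig.CLEANUP_POLICY_COMPACT).asJava)

  admin.createTopics(Collections.singleton(modelTopic)).all().get()
  admin.close()
}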

Slide 42

Slide 42 text


Slide 43

Slide 43 text

Data Source and Model Source both feed the Stream Processor: incoming models land in Model Storage (a RocksDB Key-Value Store), the Current Model is applied during Processing, and a Prediction is emitted
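
A sketch of that processing node with the Processor API: the store name "model-store", the "current" key and the scoring call are placeholders, and the sibling processor that writes new models into the same store is left out:

// ScoringProcessor.scala
import org.apache.kafka.streams.processor.{Processor, ProcessorContext}
import org.apache.kafka.streams.state.KeyValueStore

class ScoringProcessor extends Processor[String, Array[Byte]] {

  private var context: ProcessorContext = _
  private var models: KeyValueStore[String, Array[Byte]] = _

  override def init(context: ProcessorContext): Unit = {
    this.context = context
    // the RocksDB-backed store shared with the model-loading processor
    models = context.getStateStore("model-store")
      .asInstanceOf[KeyValueStore[String, Array[Byte]]]
  }

  override def process(key: String, value: Array[Byte]): Unit =
    // score only once a model has arrived on the model topic
    Option(models.get("current")).foreach { modelBytes =>
      context.forward(key, score(modelBytes, value))
    }

  // stand-in for deserializing the model and running the TensorFlow session
  private def score(model: Array[Byte], features: Array[Byte]): Array[Byte] = features

  override def close(): Unit = ()
}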

Slide 44

Slide 44 text

Continuous integration: TEST ► PACKAGE ► TRAIN ► DEPLOY MODEL ► DEPLOY KAFKA STREAMS APP, each trained model being versioned (0.1.0-…) and published with its {"metadata":"..."}

Slide 45

Slide 45 text

Scoring and Post Prediction

Slide 46

Slide 46 text

TensorBoard

Slide 47

Slide 47 text


Slide 48

Slide 48 text


Slide 49

Slide 49 text

Conclusion: how to face a drop in performance over time?

Slide 50

Slide 50 text

MERCI (Thank you)

Slide 51

Slide 51 text

PICTURES
Photo by Daniel Jensen on Unsplash
Photo by Dimon Blr on Unsplash
Photo by Lerone Pieters on Unsplash
Photo by Miryam León on Unsplash
Photo by Matthew Hamilton on Unsplash
Photo by Luke Stackpoole on Unsplash
Photo by Gustavo on Unsplash
Photo by Negative Space from Pexels
Photo by Gerrie van der Walt on Unsplash
Photo by Eepeng Cheong on Unsplash
Photo by Rock'n Roll Monkey on Unsplash
Photo by chuttersnap on Unsplash
Photo by Denys Nevozhai on Unsplash
Photo by Mike Tsitas on Unsplash

Slide 52

Slide 52 text

ANNEX

Slide 53

Slide 53 text

Is this really a 1-month challenge? Total cost: 302.82€

Slide 54

Slide 54 text

Team Work!

Slide 55

Slide 55 text

Training Job Manual Submission

Slide 56

Slide 56 text


Slide 57

Slide 57 text

A world of containers

Slide 58

Slide 58 text

Cloud is just someone else's computer

Slide 59

Slide 59 text

Data prep before TensorFlow, thank you BigQuery

Slide 60

Slide 60 text

Schemas Are Service APIs For Event Streaming