Slide 1

Slide 1 text

Alice in the world of machine learning

Slide 2

Slide 2 text

Roksolana Diachuk • Big Data Developer at Captify • Diversity & Inclusion ambassador for Captify Kyiv office • Women Who Code Kyiv Data Engineering Lead and Mentor • Speaker and traveller

Slide 3

Slide 3 text

In previous episodes (talks)…

Slide 4

Slide 4 text

Functional forest

Slide 5

Slide 5 text

TYPE SYSTEM IMMUTABLE DECLARATIVE DSL

Slide 6

Slide 6 text

magic-db-cluster-0

Slide 7

Slide 7 text

2 years later

Slide 8

Slide 8 text

magic-db-cluster-0: "Long time no see!"

Slide 9

Slide 9 text

NAME                 READY   STATUS    AGE
launcher-crd         1/1     Running   33s
magic-db-cluster-0

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

3 months passed

Slide 12

Slide 12 text

Alice came back from vacation and decided to change something in her life. She realised she had got bored working as a backend engineer and decided to switch to big data engineering.

Slide 13

Slide 13 text

"Do you know how to deploy machine learning models?"

Slide 14

Slide 14 text

"But what does this model do?" "This model extracts specific entities out of search queries."

Slide 15

Slide 15 text

“Sounds nice. But why me?” “I’ve heard that you have some experience with Kubernetes”

Slide 16

Slide 16 text

“So we really need your help with deploying these models”

Slide 17

Slide 17 text

Model Output data

Slide 18

Slide 18 text

A few days later

Slide 19

Slide 19 text

Kubeflow MLflow Clipper Self-hosted Seldon

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Central dashboard Istio Kubeflow pipelines Argo Katib TFJobs KFServing, Seldon, TFServing Data ingestion Data analysis Data transformation Data validation Trainer Model evaluation and validation Serving Logging Jupyter MinIO

Slide 23

Slide 23 text

Central dashboard Istio Kubeflow pipelines Argo Katib TFJobs KFServing, Seldon, TFServing Data ingestion Data analysis Data transformation Data validation Trainer Model evaluation and validation Serving Logging Shared utilities for garbage collection, data access control Pipeline storage Shared configuration framework and job orchestration Integrated frontend for job management, monitoring, debugging, data/model visualisation Jupyter MinIO Tuner

Slide 24

Slide 24 text

“I wonder how it works with Kubernetes”

Slide 25

Slide 25 text

Central dashboard Istio Kubeflow pipelines Argo Jupyter Katib MinIO TFJobs KFServing, Seldon, TFServing Data ingestion Data analysis Data transformation Data validation Trainer Model evaluation and validation Serving Logging Kubernetes operators Persistent volume Argo Workflows Ingress gateway Service account

Slide 26

Slide 26 text

No installations

Slide 27

Slide 27 text

gcloud services enable \
  compute.googleapis.com \
  container.googleapis.com \
  iam.googleapis.com \
  servicemanagement.googleapis.com \
  cloudresourcemanager.googleapis.com \
  ml.googleapis.com \
  meshconfig.googleapis.com

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

“Can’t wait to see what I can do with that!”

Slide 30

Slide 30 text

Pipeline Model deploy Model training Data pre-processing

Slide 31

Slide 31 text

Model deploy Model training Data pre-processing Pipeline

Slide 32

Slide 32 text

CRD CRD Pod Pod API server Pipeline service DSL compiler component.py component.yaml Argo controllers Kubeflow pipeline execution

Slide 33

Slide 33 text

A component is one step in a Kubeflow pipeline

Slide 34

Slide 34 text

Component structure

Slide 35

Slide 35 text

“But what should I do next?”

Slide 36

Slide 36 text

Docker image Docker registry Bash script

Slide 37

Slide 37 text

image_name=gcr.io/$PROJECT_ID/kubeflow/train
image_tag=latest
full_image_name=${image_name}:${image_tag}

cd "$(dirname "$0")"
docker build -t "${full_image_name}" .
docker push "$full_image_name"

Slide 38

Slide 38 text

deploy preprocess train Name Tags latest latest latest

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

name: train
description: Trains entity recognition model.
inputs:
- {name: Input x URI, type: GCSPath}
- {name: Input y URI, type: GCSPath}
- {name: Input job dir URI, type: GCSPath}
- {name: Input tags, type: Integer}
- {name: Input dropout, type: Integer}
- {name: Output model URI template, type: GCSPath}
…

Slide 41

Slide 41 text

outputs:
- {name: Output model URI, type: GCSPath}
implementation:
  container:
    image: gcr.io//kubeflow/train:latest
…

Slide 42

Slide 42 text

command: [
  python3, /pipelines/component/src/train.py,
  --input-x-path, {inputValue: Input x URI},
  --input-job-dir, {inputValue: Input job dir URI},
  --input-y-path, {inputValue: Input y URI},
  --input-tags, {inputValue: Input tags},
  --input-words, {inputValue: Input words},
  --input-dropout, {inputValue: Input dropout},
  --output-model-path-file, {outputPath: Output model URI}
]
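The command above invokes train.py inside the container. As a minimal sketch (plain argparse; the flag names are taken from the component command, everything else — values, defaults — is hypothetical and not the talk's actual script), such an entrypoint might parse those flags like this:

```python
import argparse

def parse_args(argv=None):
    # Flag names mirror the component command above; the rest is illustrative.
    parser = argparse.ArgumentParser(description="Train the entity recognition model")
    parser.add_argument("--input-x-path")
    parser.add_argument("--input-y-path")
    parser.add_argument("--input-job-dir")
    parser.add_argument("--input-tags", type=int)
    parser.add_argument("--input-dropout", type=float)
    parser.add_argument("--output-model-path-file")
    return parser.parse_args(argv)

# Example invocation with hypothetical GCS paths
args = parse_args([
    "--input-x-path", "gs://some-bucket/x.npy",
    "--input-tags", "11",
    "--input-dropout", "0.1",
])
```

Kubeflow substitutes each {inputValue: …} placeholder with the concrete argument at run time, so the script only ever sees plain CLI flags.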

Slide 43

Slide 43 text

Kubeflow UI

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

train_operation = kfp.components.load_component_from_url(
    'https://storage.googleapis.com/{}/train/preprocess/component.yaml'.format(BUCKET))

Load component

Slide 46

Slide 46 text

deploy preprocess train Name

Slide 47

Slide 47 text

training component.yaml Container Registry

Slide 48

Slide 48 text

Create pipeline Compile pipeline Create Experiment Run pipeline

Slide 49

Slide 49 text

@dsl.pipeline(
    name='DS team Pipeline',
    description='Performs preprocessing, training and deployment'
)
def pipeline():
    …

Creating pipeline

Slide 50

Slide 50 text

pipeline_func = pipeline
pipeline_filename = pipeline_func.__name__ + '.pipeline.zip'

import kfp.compiler as compiler
compiler.Compiler().compile(pipeline_func, pipeline_filename)

Compiling pipeline

Slide 51

Slide 51 text

client = kfp.Client()
try:
    experiment = client.get_experiment(experiment_name=EXPERIMENT_NAME)
except:
    experiment = client.create_experiment(EXPERIMENT_NAME)

Creating experiment

Slide 52

Slide 52 text

arguments = {}
run_name = pipeline_func.__name__ + ' run'
run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, arguments)

Running pipeline

Slide 53

Slide 53 text

Preprocess Train Deploy

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

Preprocess Train Deploy

Slide 56

Slide 56 text

Training metrics

Run name         Status   Accuracy score
pipeline run 1            98.4%
pipeline run 2            98.86%
pipeline run 3            97.65%

Slide 57

Slide 57 text

Test your model with sample input data

{ "instances": ["London on Monday evening"] }
{ "instances": ["weather in London today"] }

Test

Slide 58

Slide 58 text

{
  "predictions": [
    ["B-natural phenomenon", "O", "B-geographical entity", "B-time indicator"]
  ]
}
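A minimal sketch (plain Python, assuming the serving endpoint returns JSON shaped like the slide above) of pairing the sample query's tokens with the predicted entity tags:

```python
import json

# Response in the shape shown above; the tags correspond to the sample
# query "weather in London today" from the previous slide
raw = '''
{
  "predictions": [
    ["B-natural phenomenon", "O", "B-geographical entity", "B-time indicator"]
  ]
}
'''
response = json.loads(raw)

tokens = "weather in London today".split()
tags = response["predictions"][0]
labeled = list(zip(tokens, tags))
# labeled[2] pairs "London" with its geographical entity tag
```

This is just post-processing on the caller's side; the model itself only returns the tag sequence.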

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

“It looks great! Thank you very much!” “You’re welcome!”

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

{ "instances": ["Pods and higher-order functions are in danger"] }

Test

Slide 64

Slide 64 text

To be continued…

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

Resources

Slide 67

Slide 67 text

My contact info: dead_flowers22 • roksolana-d • roksolanadiachuk • roksolanad