Define → Collect Data → Clean Data → Build Model → Train Model → Deploy Model → Monitor/Operate Model
Slide 4
Slide 4 text
Pre-trained Model
Model artifacts (structure, helper files, parameters, weights, ...)
JSON or Binary format
Can be serialized (saved) and de-serialized (loaded)
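A minimal sketch of saving and loading such a pre-trained model, assuming scikit-learn and joblib (the slides do not name a framework; the file names are illustrative):

```python
import json
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a tiny stand-in model so there is something to save.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# Serialize (save): a binary artifact for the parameters/weights...
joblib.dump(model, "model.joblib")
# ...plus an optional JSON helper file describing the artifact.
with open("model.json", "w") as f:
    json.dump({"framework": "sklearn", "artifact": "model.joblib"}, f)

# De-serialize (load) and run inference on the restored model.
restored = joblib.load("model.joblib")
print(restored.predict(X[:5]))
```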
Slide 5
Slide 5 text
Serverless!
Slide 6
Slide 6 text
Inference!
Slide 7
Slide 7 text
1. Generic Architecture
What should a good design look like?
Slide 8
Slide 8 text
Load Balancer → Model Server → Model Artifacts
Slide 9
Slide 9 text
2. Lambda and S3 buckets
Slide 10
Slide 10 text
API Gateway / Kinesis Streams → Lambda → S3
Where does the pre-trained model live? (both options are sketched below)
● Part of the Lambda deployment package
● On demand from S3
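A minimal sketch of a Lambda handler covering both options; the bucket name, object key, and joblib format are assumptions for illustration:

```python
import os

import boto3
import joblib

MODEL_BUCKET = os.environ.get("MODEL_BUCKET", "my-model-bucket")  # hypothetical bucket
MODEL_KEY = os.environ.get("MODEL_KEY", "models/model.joblib")    # hypothetical key

_model = None  # cached across warm invocations to avoid repeated loads/transfers


def _load_model():
    global _model
    if _model is None:
        # Option 1: the model ships inside the Lambda deployment package.
        local_path = "model.joblib"
        if not os.path.exists(local_path):
            # Option 2: fetch it on demand from S3 into Lambda's writable /tmp.
            local_path = "/tmp/model.joblib"
            boto3.client("s3").download_file(MODEL_BUCKET, MODEL_KEY, local_path)
        _model = joblib.load(local_path)
    return _model


def handler(event, context):
    # The event shape (a "features" list) is an assumption for this sketch.
    prediction = _load_model().predict([event["features"]]).tolist()
    return {"prediction": prediction}
```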
Slide 11
Slide 11 text
Trade-offs
● Size of the model
● Memory for a Lambda
● Processing power for a Lambda
● Multiple models?
● Latency of transfer from S3
Slide 12
Slide 12 text
3. Sagemaker
Slide 13
Slide 13 text
Sagemaker: ECR, S3, Python SDK
Slide 14
Slide 14 text
What does the sage do?
Slide 15
Slide 15 text
Sagemaker
What does the Sagemaker do?
◍ Runs the Inference code
◍ Copies Models from S3
◍ Load Balancer
◍ Monitoring
◍ C-test on new Models
What does the Sagemaker need? (see the sketch below)
◍ Inference code as an ECR image
◍ Location of the models on S3
◍ Type of machine to run on
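A minimal sketch of handing those three inputs to Sagemaker with boto3; every name, ARN, and URI below is a placeholder:

```python
import boto3

sm = boto3.client("sagemaker")

# 1. The model: the ECR image with the inference code plus the S3 model location.
sm.create_model(
    ModelName="demo-model",
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SagemakerExecutionRole",  # placeholder
    PrimaryContainer={
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:latest",  # placeholder ECR image
        "ModelDataUrl": "s3://my-bucket/models/model.tar.gz",  # placeholder S3 location
    },
)

# 2. The endpoint configuration: which type of machine runs the container.
sm.create_endpoint_config(
    EndpointConfigName="demo-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "demo-model",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
)

# 3. The endpoint itself, which Sagemaker load-balances and monitors.
sm.create_endpoint(EndpointName="demo-endpoint", EndpointConfigName="demo-config")
```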
Slide 16
Slide 16 text
Specifications
/opt/ml/model } The directory to which the models are copied from S3
docker run <image> serve } The command Sagemaker uses to start the container
/ping, /invocations } Endpoints that must be defined
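A minimal sketch of inference code that satisfies this contract; Flask and the joblib artifact name are illustrative choices, not requirements of Sagemaker:

```python
import os

import joblib
from flask import Flask, Response, jsonify, request

MODEL_DIR = "/opt/ml/model"  # Sagemaker copies the model artifacts here from S3
model = joblib.load(os.path.join(MODEL_DIR, "model.joblib"))  # artifact name is an assumption

app = Flask(__name__)


@app.route("/ping", methods=["GET"])
def ping():
    # Health check: HTTP 200 tells Sagemaker the container is ready for traffic.
    return Response(status=200)


@app.route("/invocations", methods=["POST"])
def invocations():
    # Inference request forwarded by the Sagemaker endpoint.
    payload = request.get_json()
    prediction = model.predict([payload["features"]]).tolist()
    return jsonify({"prediction": prediction})


if __name__ == "__main__":
    # Reached when Sagemaker starts the container with "docker run <image> serve",
    # assuming the image's entry point maps "serve" to running this script.
    app.run(host="0.0.0.0", port=8080)
```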
Slide 17
Slide 17 text
Sagemaker: ECR (inference code exposing /ping and /invocations), S3 (where the models live), Python SDK
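A minimal sketch of calling the deployed endpoint from the Python SDK side; the endpoint name and payload shape are placeholders:

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="demo-endpoint",        # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"features": [1.0, 2.0, 3.0, 4.0]}),
)

# Sagemaker routes this request to the container's /invocations endpoint.
print(json.loads(response["Body"].read()))
```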