Building and Deploying ML Applications on Production in a Fraction of the Time

Building and Deploying ML Applications on production in a fraction
of the time. #predictionio SF Data Mining

Available Open Source Tools Processing Framework • e.g. Apache Spark,
Apache Hadoop Algorithm Libraries • e.g. MLlib, Mahout Data Storage • e.g. HBase, Cassandra

Integrate everything together nicely and move from prototyping to production.
What is Missing?

You have a mobile app A Classic Recommender Example… App
Predict products You need a Recommendation Engine Predict products that a customer will like – and show it. Predictive model Algorithm - Predictive model - based on users’ behaviors

def pseudocode () { // Read training data  val trainingData
= sc.textFile("trainingData.txt").map(_.split(',') match  { …. }) // Build a predictive model with an algorithm  val model = ALS.train(trainingData, 10, 20, 0.01) // Make prediction  allUsers.foreach { user =>  model.recommendProducts(user, 5)  } } A Classic Recommender Example prototyping…

• How to deploy a scalable service that respond to
dynamic prediction query? • How do you persist the predictive model, in a distributed environment? • How to make HBase, Spark and algorithms talking to each other? • How should I prepare, or transform, the data for model training? • How to update the model with new data without downtime? • Where should I add some business logics? • How to make the code conﬁgurable, re-usable and maintainable? • How do I build all these with a separate of concerns (SoC)? Beyond Prototyping

Engine Event Server (data storage) Data: User Actions Query via
REST: User ID Predicted Result: A list of Product IDs A Classic Recommender Example on production… Mobile App

Data: User Actions Query via REST: User ID Predicted Result:
A list of Product IDs Engine Event Server (data storage) Mobile App Event Server

• $ pio eventserver • Event-based client.create_event( event="rate", entity_type="user", entity_id=“user_123”,
target_entity_type="item", target_entity_id=“item_100”, properties= { "rating" : 5.0 } ) Event Server Collecting Date

Query via REST: User ID Predicted Result: A list of
Product IDs Engine Data: User Actions Event Server (data storage) Mobile App Engine

• DASE - the “MVC” for Machine Learning • Data:
Data Source and Data Preparator • Algorithm(s) • Serving • Evaluator Engine Building an Engine with Separation of Concerns (SoC)

A. Train deployable predictive model(s) B. Respond to dynamic query
C. Evaluation Engine Functions of an Engine

Engine A. Train predictive model(s) Event Server Algorithm 1 Algorithm
3 Algorithm 2 PreparedDate Engine Data Preparator Data Source TrainingDate Model 3 Model 1 Model 2

Engine A. Train predictive model(s) class DataSource(…) extends PDataSource def
readTraining(sc: SparkContext) ==> trainingData class Preparator(…) extends PPreparator def prepare(sc: SparkContext, trainingData: TrainingData) ==> preparedData class Algorithm1(…) extends PAlgorithm def train(prepareData: PreparedData) ==> Model $ pio train

B. Respond to dynamic query Engine Algorithm 1 Model 1
Serving Mobile App Algorithm 3 Model 3 Algorithm 2 Model 2 Predicted Results Query (input) Predicted Result (output) Engine

B. Respond to dynamic query Engine • Query (Input) : 
  $ curl -H "Content-Type: application/json" -d   '{ "user": "1", "num": 4 }'   http://localhost:8000/queries.json case class Query( val user: String, val num: Int ) extends Serializable

B. Respond to dynamic query Engine • Predicted Result (Output): 
  {“itemScores”:[{"item":"22","score":4.072304374729956}, {"item":"62","score":4.058482414005789},  {"item":"75","score":4.046063009943821}]} case class PredictedResult( val itemScores: Array[ItemScore] ) extends Serializable case class ItemScore( item: String, score: Double ) extends Serializable

class Algorithm1(…) extends PAlgorithm def predict(model: ALSModel, query: Query) ==>
predictedResult class Serving extends LServing def serve(query: Query, predictedResults: Seq[PredictedResult]) ==> predictedResult B. Respond to dynamic query Engine Query via REST

Engine DASE Factory object RecEngine extends IEngineFactory { def apply()
= { new Engine( classOf[DataSource], classOf[Preparator], Map("algo1" -> classOf[Algorithm1]), classOf[Serving]) } }

• PredictionIO is a machine learning server for building and
deploying predictive engines  on production  in a fraction of the time. • Built on Apache Spark, MLlib and HBase. PredictionIO

Running on Production • Install PredictionIO  $ bash -c "$(curl
-s http://install.prediction.io/install.sh)" • Start the Event Server  $ pio eventserver • Deploy an Engine  $ pio build; pio train; pio deploy • Update Engine Model with New Data  $ pio train; pio deploy

Deploy on Production Website Mobile App Email Campaign Event Server
(data storage) Data Query via REST Predicted Result Engine 1 Engine 3 Engine 2 Engine 4

The Next Step • Quickstart with an Engine Template! •
Follow on Github: github.com/predictionio/ • Learn PredictionIO: prediction.io/ • Contribute!

Thanks. Simon Chan [email protected] @PredictionIO prediction.io (Newsletters) github.com/predictionio

Building and Deploying ML Applications on Produ...

Building and Deploying ML Applications on Production in a Fraction of the Time

Hakka Labs

More Decks by Hakka Labs

Other Decks in Programming

Featured

Transcript

Building and Deploying ML Applications on production in a fraction

Available Open Source Tools Processing Framework • e.g. Apache Spark,

Integrate everything together nicely and move from prototyping to production.

You have a mobile app A Classic Recommender Example… App

def pseudocode () { // Read training data  val trainingData

• How to deploy a scalable service that respond to

Engine Event Server (data storage) Data: User Actions Query via

Data: User Actions Query via REST: User ID Predicted Result:

• $ pio eventserver • Event-based client.create_event( event="rate", entity_type="user", entity_id=“user_123”,

Query via REST: User ID Predicted Result: A list of

• DASE - the “MVC” for Machine Learning • Data:

A. Train deployable predictive model(s) B. Respond to dynamic query

Engine A. Train predictive model(s) Event Server Algorithm 1 Algorithm

Engine A. Train predictive model(s) class DataSource(…) extends PDataSource def

B. Respond to dynamic query Engine Algorithm 1 Model 1

B. Respond to dynamic query Engine • Query (Input) :

B. Respond to dynamic query Engine • Predicted Result (Output):

class Algorithm1(…) extends PAlgorithm def predict(model: ALSModel, query: Query) ==>

Engine DASE Factory object RecEngine extends IEngineFactory { def apply()

• PredictionIO is a machine learning server for building and

Running on Production • Install PredictionIO  $ bash -c "$(curl

Deploy on Production Website Mobile App Email Campaign Event Server

The Next Step • Quickstart with an Engine Template! •

Thanks. Simon Chan [email protected] @PredictionIO prediction.io (Newsletters) github.com/predictionio