Slide 1

Slide 1 text

SF Data Mining January 26, 2015
Deploying predictive models
Nick Elprin
Domino Data Lab
dominodatalab.com

Slide 2

Slide 2 text

Who am I?
• Founder of Domino Data Lab, a software platform for enterprise data science
• Previously built analytical software at a big hedge fund
• BA, MS in computer science

Slide 3

Slide 3 text

Motivation
Build predictive models
Build production software systems
Different languages good for different tasks

Slide 4

Slide 4 text

Motivation
Organizational design friction
Model improvements
Data Scientists
Software Engineering
Delayed because of:
• Integration / porting of logic
• Out-of-phase release cycles
• Mismatched priorities

Slide 5

Slide 5 text

Solution
Publish: Data scientists create predictive models and publish them to Domino.
Domino provides a secure, low-latency infrastructure for hosting predictive models as web services.
Consume: Developers can invoke models from general-purpose languages by making simple HTTP calls.
• Failover
• Security
• Logging
• Seamless updates
• etc.
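To make the "simple HTTP calls" point concrete, here is a minimal sketch of what a consuming application might do, using only the Python standard library. The endpoint URL, the `parameters` field in the JSON body, and the bearer-token header are all assumptions made for illustration; the slides do not specify Domino's actual URL scheme, payload format, or authentication mechanism.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL scheme is not given in the slides.
ENDPOINT = "https://models.example.com/v1/loan-approval/predict"


def build_request(features, api_key, url=ENDPOINT):
    """Build a JSON POST request carrying the model inputs.

    The {"parameters": ...} body shape and the Authorization header
    are illustrative assumptions, not a documented API.
    """
    body = json.dumps({"parameters": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
    )


def predict(features, api_key, url=ENDPOINT, timeout=2.0):
    """Send the request and decode the JSON response from the model service."""
    request = build_request(features, api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the call is plain HTTP with a JSON body, the same request could be issued from Java, C#, JavaScript, or any other general-purpose language, which is the point of the publish/consume split.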

Slide 6

Slide 6 text

Demo

Slide 7

Slide 7 text

Production concerns
• Very low latency
• Zero-downtime upgrades
• High availability
• Reproducibility
• Logging
• Security

Slide 8

Slide 8 text

Best practices
• Separate training, initialization, and prediction
• Make your prediction functions thread-safe
• Don’t mutate any shared state
• Leverage persistence/serialization tools (e.g., pickle)
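The practices above can be sketched in a few lines of Python. This is an illustrative toy, not code from the talk: the "model" is just a threshold at the mean of the training values, and the function names `train`, `initialize`, `predict`, and `predict_concurrently` are hypothetical. Training runs offline and persists its result with `pickle`, initialization loads it once, and prediction only reads the loaded parameters, so it can be called from many threads at once.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def train(values):
    """Training: run offline; returns the fitted parameters as pickled bytes."""
    # Toy "model": a decision threshold at the mean of the training values.
    params = {"threshold": sum(values) / len(values)}
    return pickle.dumps(params)


def initialize(blob):
    """Initialization: run once at start-up to load the fitted parameters."""
    return pickle.loads(blob)


def predict(params, x):
    """Prediction: run per request; reads params but never modifies them."""
    return 1 if x > params["threshold"] else 0


def predict_concurrently(params, inputs, workers=4):
    """Serve several inputs from a thread pool, sharing the same params."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(lambda x: predict(params, x), inputs))
```

In a real deployment the pickled bytes would typically be written to a file by the training job and read back by the serving process. Note that `pickle` should only be used to load data from a trusted source, since unpickling can execute arbitrary code.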

Slide 9

Slide 9 text

Use cases
• Lease / loan approval
• Recommendation systems
  • Music, books, products, cars, etc.
• Insurance
  • Quoting premiums; claims estimates

Slide 10

Slide 10 text

Check us out
dominodatalab.com
blog.dominodatalab.com
@dominodatalab
Webinar on parallel programming in R and Python. Jan 28, 10:30am
dominodatalab.com/webinar