Operationalising Data Science on Cloud Foundry

Ian Huston
September 28, 2016

Delivered at CF Summit Europe 2016 in Frankfurt, 28th September 2016

The first part of this talk argues why Cloud Foundry is ideal for operationalising and scaling machine learning models.

The core of the talk describes a scalable cloud native architecture for operationalising data science models based on microservices. There is a video demo of this architecture in action.

The last part of the talk explains the limits of Cloud Foundry in the context of data science and what still has to be achieved for Cloud Foundry to become an end-to-end data science platform.

Transcript

  1. Operationalising Data Science on CF
    Ian Huston

  2. Who am I?
    ● Data Scientist working with clients at Pivotal Labs
    ● Cloud Foundry user
    ● Community Buildpack writer
    @ianhuston ihuston

  3. “Everybody wants systems that are smarter, everybody wants systems that are more predictive, everybody wants everything scored, everybody wants to understand what’s the next best offer, next best opportunity, how to make things a little bit more efficient.”
    Marc Benioff, Forbes CIO Summit, March 7, 2016

  4. What’s the problem?
    If your Machine Learning model is not in production,
    it does not provide business value.
    A slide deck does not count as production!

  5. Who has this problem?
    Data Scientists: Want their model to make an impact
    Developers: Want to add ML to their app
    CIOs/CDOs: Want return on ‘Big Data’ investment
    Our Clients: Want to implement their first ML models

  6. Day 1 Problems
    Load and Transform Data
    Train the Predictive Model
    Connect to Incoming Data
    Apply Model
    Take Action
    Run the Model

  7. ‘Scoring As A Service’
    Ingest Data
    Build Model elsewhere with offline data
    Serve Result or Take Action
    Apply Model
    Store Model
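
The boxes on this slide can be sketched as a small Python scoring function of the kind a microservice might wrap. This is only an illustration under stated assumptions, not the architecture shown in the talk: the logistic-regression model, the JSON storage format, the weight values, and the names `save_model`, `load_model`, `score`, and `serve` are all hypothetical.

```python
import json
import math
import os
import tempfile

# Hypothetical model artefact: logistic-regression parameters trained
# elsewhere on offline data. The values are illustrative only.
OFFLINE_MODEL = {"version": "1", "bias": -1.0, "weights": {"x1": 2.0, "x2": -0.5}}


def save_model(model, path):
    """Store Model: persist the trained parameters as JSON."""
    with open(path, "w") as f:
        json.dump(model, f)


def load_model(path):
    """Read the stored model back inside the scoring service."""
    with open(path) as f:
        return json.load(f)


def score(model, record):
    """Apply Model: return a probability for one ingested record."""
    z = model["bias"] + sum(
        weight * record.get(name, 0.0) for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def serve(model, record, threshold=0.5):
    """Serve Result or Take Action: return the score plus a yes/no decision."""
    probability = score(model, record)
    return {
        "model_version": model["version"],
        "score": probability,
        "act": probability >= threshold,
    }


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "model.json")
    save_model(OFFLINE_MODEL, path)
    print(serve(load_model(path), {"x1": 1.0, "x2": 2.0}))
```

Keeping the stored model as plain data means the scoring process does not need the training code or training data at run time, which is the separation the slide draws between building the model elsewhere and applying it in the service.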

  8. ‘CF powered Learning’
    Ingest Data
    Build Model in CF or elsewhere
    Serve Result or Take Action
    Apply Model
    Store Model
    Batch Update
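
The batch update step could look roughly like the following Python sketch, in which a model is refitted on all data collected so far and stored as a new version. Everything here is an assumption made for illustration: the in-memory `ModelStore`, the one-variable least-squares model, and the function names do not come from the talk.

```python
class ModelStore:
    """In-memory stand-in for the 'Store Model' box (hypothetical)."""

    def __init__(self):
        self._versions = []

    def put(self, model):
        """Append a model and return its 1-based version number."""
        self._versions.append(model)
        return len(self._versions)

    def latest(self):
        """Return the most recently stored model."""
        return self._versions[-1]


def fit_line(xs, ys):
    """Build Model: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"intercept": mean_y - slope * mean_x, "slope": slope}


def apply_model(model, x):
    """Apply Model: predict y for a single input x."""
    return model["intercept"] + model["slope"] * x


def batch_update(store, history):
    """Batch Update: refit on the accumulated (x, y) pairs and store the result."""
    xs, ys = zip(*history)
    return store.put(fit_line(xs, ys))


if __name__ == "__main__":
    store = ModelStore()
    history = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    version = batch_update(store, history)
    print(version, apply_model(store.latest(), 3.0))
```

The scoring path only ever calls `store.latest()`, so a scheduled batch job can publish a new version without changing the code that applies the model.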

  9. In-Stream (Online) Learning
    Ingest Data
    Serve Result or Take Action
    Update & Apply Model
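
One way to picture "Update & Apply Model" is a model that predicts on each incoming record and then immediately adjusts itself once the true outcome is known. The Python sketch below uses stochastic gradient descent on a linear model purely as an example; the talk does not name an algorithm, and the class and function names are hypothetical.

```python
class OnlineLinearModel:
    """Linear model updated one record at a time with gradient descent."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Apply Model: weighted sum of the features plus bias."""
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target):
        """Update Model: one gradient step on the squared error for this record."""
        error = self.predict(features) - target
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]


def process_stream(model, stream):
    """Predict on each record first, then learn from its observed outcome."""
    predictions = []
    for features, target in stream:
        predictions.append(model.predict(features))
        model.update(features, target)
    return predictions


if __name__ == "__main__":
    model = OnlineLinearModel(n_features=1)
    stream = [([1.0], 2.0)] * 200
    predictions = process_stream(model, stream)
    print(predictions[0], model.predict([1.0]))
```

Unlike the batch variants, there is no separate stored model here: the state lives in the running process, so persistence and restarts would need extra handling that this sketch leaves out.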

  10. How can I do this?
    Build it:
    Spring Cloud Data Flow
    Marketplace data services
    Spring Boot
    Python ML microservices
    Use it:
    Initial offerings from GE, IBM, Alpine, Bosch

  11. http://moves.cfapps.pez.pivotal.io/about

  12. Day 2 Problems
    Which predictions were made with the old model and which with the new?
    Do I need to continue serving the old predictions?
    Which library versions were used with the old model and with the new?
    Has the data schema changed in the underlying system?
    Can I provide the right inputs for the different model versions?
    Can I replay the stream?
    Update the Model!
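
Several of these questions become easier to answer if every prediction is logged together with its provenance. The following Python sketch shows one possible record layout; the field names, the `library_versions` mapping supplied by the caller, and the helper functions are assumptions for illustration, not something described in the talk.

```python
import platform
from datetime import datetime, timezone


def make_prediction_record(model_version, schema_version, inputs, prediction,
                           library_versions):
    """Bundle a prediction with the context needed to explain it later."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "schema_version": schema_version,
        "python_version": platform.python_version(),
        "library_versions": dict(library_versions),
        "inputs": dict(inputs),
        "prediction": prediction,
    }


def predictions_by_model(log, model_version):
    """Which predictions were made with a given model version?"""
    return [record for record in log if record["model_version"] == model_version]


def replay(log, predict_fn):
    """Re-run stored inputs through a (possibly newer) prediction function."""
    return [predict_fn(record["inputs"]) for record in log]


if __name__ == "__main__":
    # "examplelib" is a placeholder name, not a real dependency.
    log = [
        make_prediction_record("v1", 1, {"x": 1.0}, 0.2, {"examplelib": "0.1"}),
        make_prediction_record("v2", 1, {"x": 2.0}, 0.7, {"examplelib": "0.2"}),
    ]
    print(len(predictions_by_model(log, "v1")))
    print(replay(log, lambda inputs: inputs["x"] * 2))
```

Storing the raw inputs alongside the version tags is what makes replay possible; without them only the old outputs can be inspected.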

  13. Using a Model Service
    ● Version control for model
    ● Parse data with varying schemas
    ● Serve appropriate model version based on consuming app
    ● Store underlying data for model re-training and reproducibility
    Ingest Data
    Model Service
    Serve Result or Take Action
    Apply Model
    Store Model
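
The four responsibilities listed on this slide can be sketched as a single small class. This is a hypothetical in-memory illustration in Python, not the model service from the talk: the class, its method names, the temperature example, and the app names are all invented for the sketch.

```python
class ModelService:
    """Hypothetical in-memory model service covering the four bullets."""

    def __init__(self):
        self._models = {}   # model version -> prediction function
        self._parsers = {}  # schema version -> function producing common features
        self._pins = {}     # consuming app name -> model version it should receive
        self.raw_log = []   # raw inputs kept for re-training and reproducibility

    def register_model(self, version, predict_fn):
        """Version control: keep every model under an explicit version key."""
        self._models[version] = predict_fn

    def register_parser(self, schema_version, parse_fn):
        """Varying schemas: one parser per schema, all emitting the same features."""
        self._parsers[schema_version] = parse_fn

    def pin_app(self, app_name, model_version):
        """Choose which model version a consuming app is served."""
        if model_version not in self._models:
            raise KeyError("unknown model version: %s" % model_version)
        self._pins[app_name] = model_version

    def predict(self, app_name, schema_version, raw_record):
        """Store the raw record, parse it, and apply the app's pinned model."""
        self.raw_log.append((schema_version, dict(raw_record)))
        features = self._parsers[schema_version](raw_record)
        version = self._pins[app_name]
        return {"model_version": version,
                "prediction": self._models[version](features)}


def build_demo_service():
    """Wire up two schemas and two model versions with made-up rules."""
    service = ModelService()
    # Schema 1 reports Fahrenheit, schema 2 reports Celsius.
    service.register_parser(1, lambda r: {"temp_c": (r["temp_f"] - 32.0) * 5.0 / 9.0})
    service.register_parser(2, lambda r: {"temp_c": r["temp_c"]})
    service.register_model("v1", lambda f: f["temp_c"] > 30.0)
    service.register_model("v2", lambda f: f["temp_c"] > 25.0)
    service.pin_app("legacy-app", "v1")
    service.pin_app("new-app", "v2")
    return service


if __name__ == "__main__":
    service = build_demo_service()
    print(service.predict("legacy-app", 1, {"temp_f": 95.0}))
    print(service.predict("new-app", 2, {"temp_c": 27.0}))
```

Because parsers normalise every schema to the same feature names, any model version can be applied to data from any schema version, which is one way to address the question of providing the right inputs to different model versions.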

  14. What’s Next for Data Science on CF?
    More data services
    More demos of data science/ML models
    Examples of successful projects
    Building blocks for building your own ML services
    dsoncf.com @ianhuston
