
Machine and Deep Learning with Spring Cloud Data Flow (Spring Meetup NL)

Spring Meetup (NL): https://www.meetup.com/Dutch-Spring-Meetup/events/247764263/

Machine Learning (ML) and Deep Learning (DL) have disrupted the software engineering field. Techniques such as computer vision and language processing have brought unprecedented abilities to everyday software developers. Data scientists and software engineers can build the ML / DL models, while software developers can apply those pre-trained models in production.

You can significantly simplify the task of implementing and operating ML / DL models. In this talk, Christian will demonstrate how to use pre-trained computer vision and Twitter sentiment analysis TensorFlow models in regular Spring Cloud Stream pipelines.

Furthermore, we will review several portable ML / DL formats, such as PMML, PFA, MLeap and TensorFlow, and how they relate to Spring Cloud Data Flow.

Christian Tzolov is a Principal Software Engineer at Pivotal, where he works on the Spring Cloud Data Flow team. He is an Apache Committer and Apache Crunch PMC Member. Christian is an OSS lawyer and is interested in integration and interoperability architectures for distributed and data-intensive systems.

Christian Tzolov

March 29, 2018

Transcript

  1. Machine & Deep Learning with Spring Cloud Data Flow
    Christian Tzolov
    Pivotal Engineer, Spring Cloud Data Flow
    Apache Committer, Crunch PMC member


  2. Industry Trends
    - Enterprises are adopting DevOps practices in their transition into software and data-driven businesses.
    - ETL integration with existing systems and modernization efforts are still very important.
    - Continuous event processing is becoming mainstream.
    - Integration of IoT data flows and Machine Learning / Deep Learning algorithms


  3. Machine / Deep Learning (ML/DL)
    - Brings unprecedented abilities to the Software Engineering field.
    - Provides a different way to reason about problems
    - Solves “un-programmable” tasks


  4. For Java Practitioners?
    How can ML/DL help us deliver richer business solutions?


  5. Spoiler: Spring Cloud Data Flow (SCDF) would tackle the ML integration complexity
    Image Recognition TensorFlow Demo:


  6. The ML Paradigm
    - Observations about an uncertain world
    - Experiments with training datasets
    - Statistics to analyze the results


  7. ML/DL Life-cycle
    - Phase 1: Train model on historical datasets
    - Phase 2: Run pre-trained model for predictive analytics


  8. For Java practitioners?
    Model inference for predictive analytics is the most common use of ML/DL in Java applications.


  9. Inference Considerations
    - ML Model Reusability: PMML, PFA, MLeap, ONNX … TensorFlow
    - Model Serving vs Embedding
    [Diagrams: Input Data Stream -> Java Process -> Output Predictions Stream, shown twice: once with the Pre-trained ML Model inside the Java Process, and once with an External System alongside it]
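
The "Model Serving vs Embedding" choice can be sketched in plain Java. This is only an illustration under stated assumptions: `InferenceModes`, `Predictor`, `toyModel` and `fakeServer` are hypothetical names, the toy model is not a real ML model, and the `transport` function stands in for whatever remote call a real serving setup would use.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Contrasts an embedded model with a served one behind the same interface. */
public class InferenceModes {

    /** Contract the stream-processing code depends on in both modes. */
    public interface Predictor {
        String predict(double[] features);
    }

    /** Hypothetical toy model: labels by the first feature only. */
    static String toyModel(double[] features) {
        return features[0] > 0.5 ? "positive" : "negative";
    }

    /** Embedding: the model runs inside the Java process, so predicting is a local call. */
    public static Predictor embedded() {
        return InferenceModes::toyModel;
    }

    /** Serving: features are serialized and sent to an external system through the given transport. */
    public static Predictor served(Function<String, String> transport) {
        return features -> transport.apply(Arrays.toString(features));
    }

    /** In-memory stand-in for the external model server: parses "[a, b]" and runs the toy model. */
    public static String fakeServer(String request) {
        String[] parts = request.substring(1, request.length() - 1).split(",\\s*");
        double[] features = Arrays.stream(parts).mapToDouble(Double::parseDouble).toArray();
        return toyModel(features);
    }

    public static void main(String[] args) {
        double[] sample = {0.9, 0.1};
        System.out.println(embedded().predict(sample));
        System.out.println(served(InferenceModes::fakeServer).predict(sample));
    }
}
```

Both predictors return the same label for the same input. The difference is where the model runs: the served variant pays for serialization and a round trip, while the embedded variant ships the model with the application.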


  10. Reference Architecture
    - Real-Time ML Inference
    - Embedded Pre-trained Models
    - PMML & TensorFlow ML models
    [Diagram: Input data Stream -> Java Process (Pre-trained ML Model) -> Output Predictions]


  11. Species Prediction
    Iris Flower Dataset: https://en.wikipedia.org/wiki/Iris_flower_data_set
    Naive Bayes classifier: https://en.wikipedia.org/wiki/Naive_Bayes_classifier
    SCDF Sample: https://docs.spring.io/spring-cloud-dataflow-samples/docs/current/reference/htmlsingle/#_data_science
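
To make the species prediction idea concrete, here is a minimal Gaussian Naive Bayes sketch in plain Java. It is not the PMML model or the SCDF sample linked above. The class name `IrisNaiveBayes`, the equal class priors and the variance smoothing constant are assumptions chosen for illustration, and the training data is only the first three rows of each species from the Iris dataset, far too few for a real model.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal Gaussian Naive Bayes sketch: train on labeled rows, then predict. */
public class IrisNaiveBayes {
    // Per-class feature means and variances learned during training.
    private final Map<String, double[]> means = new LinkedHashMap<>();
    private final Map<String, double[]> variances = new LinkedHashMap<>();

    /** Training phase: estimate mean and variance of every feature for one class. */
    public void train(String label, double[][] rows) {
        int n = rows[0].length;
        double[] mean = new double[n];
        double[] var = new double[n];
        for (double[] r : rows)
            for (int j = 0; j < n; j++) mean[j] += r[j] / rows.length;
        for (double[] r : rows)
            for (int j = 0; j < n; j++) var[j] += (r[j] - mean[j]) * (r[j] - mean[j]) / rows.length;
        // Assumed smoothing term so that tiny samples never produce a zero variance.
        for (int j = 0; j < n; j++) var[j] += 1e-2;
        means.put(label, mean);
        variances.put(label, var);
    }

    /** Inference phase: pick the class with the highest Gaussian log-likelihood (equal priors assumed). */
    public String predict(double[] x) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : means.keySet()) {
            double[] m = means.get(label);
            double[] v = variances.get(label);
            double score = 0;
            for (int j = 0; j < x.length; j++)
                score += -0.5 * Math.log(2 * Math.PI * v[j]) - (x[j] - m[j]) * (x[j] - m[j]) / (2 * v[j]);
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }

    /** Builds a toy model from the first three Iris rows of each species. */
    public static IrisNaiveBayes demoModel() {
        IrisNaiveBayes nb = new IrisNaiveBayes();
        // Columns: sepal length, sepal width, petal length, petal width (cm).
        nb.train("setosa", new double[][] {{5.1, 3.5, 1.4, 0.2}, {4.9, 3.0, 1.4, 0.2}, {4.7, 3.2, 1.3, 0.2}});
        nb.train("versicolor", new double[][] {{7.0, 3.2, 4.7, 1.4}, {6.4, 3.2, 4.5, 1.5}, {6.9, 3.1, 4.9, 1.5}});
        nb.train("virginica", new double[][] {{6.3, 3.3, 6.0, 2.5}, {5.8, 2.7, 5.1, 1.9}, {7.1, 3.0, 5.9, 2.1}});
        return nb;
    }

    public static void main(String[] args) {
        IrisNaiveBayes nb = demoModel();
        System.out.println(nb.predict(new double[] {5.0, 3.4, 1.5, 0.2})); // prints "setosa"
    }
}
```

The `train` and `predict` methods correspond to the two life-cycle phases: the model is fitted once on historical data, and only the inference step would run inside a streaming application.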



  12. Let’s do a Twitter Sentiment Analysis!
    Ingest -> Processing (predictions) -> Storage


  13. Let’s do a real-time Object Detection


  14. Spring Cloud Data Flow
    a toolkit for building data integration, real-time, and
    batch data processing pipelines


  15. Spring Cloud Stream
    an event-driven microservice framework
    - eliminate boilerplate when developing messaging apps
    - pluggable messaging middleware abstraction
    - durable publish/subscribe semantics
    - data partitioning
    - schema evolution and management


  16. Spring Cloud Streams
    - DSL inspired by Unix Pipes & Filters
    - Source | Processor* | Sink
    - Data payload flows through some transport abstraction
    - Example:
    stream create demo --deploy --definition "http | transform --expression=payload.toUpperCase() | file"
    [Diagram: Stream A runs Source -> Processor -> Processor -> Sink over the Transport Middleware; the example is annotated with Source, Processor, option and Sink labels]
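
The "Source | Processor* | Sink" shape can be mimicked in plain Java to show how a payload moves through the stages. This is only an analogy, not the SCDF runtime or the Spring Cloud Stream API: `PipelineSketch`, `run` and `demo` are hypothetical names, and in-memory lists stand in for the `http` source, the `file` sink and the transport middleware.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/** Plain-Java analogy of "Source | Processor* | Sink". */
public class PipelineSketch {

    /** Reads each payload from the source, applies the processors in order, and hands the result to the sink. */
    public static void run(Supplier<List<String>> source,
                           List<Function<String, String>> processors,
                           Consumer<String> sink) {
        for (String payload : source.get()) {
            String current = payload;
            for (Function<String, String> processor : processors) {
                current = processor.apply(current);
            }
            sink.accept(current);
        }
    }

    /** Mirrors "http | transform --expression=payload.toUpperCase() | file" with in-memory stand-ins. */
    public static List<String> demo(List<String> input) {
        List<String> written = new ArrayList<>();               // stand-in for the file sink
        Function<String, String> transform = String::toUpperCase; // the transform processor
        run(() -> input, Arrays.asList(transform), written::add);
        return written;
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("hello", "spring"))); // prints [HELLO, SPRING]
    }
}
```

In the real system each stage is a separate application and the payload travels over messaging middleware between them. The sketch keeps everything in one process so the composition itself is easy to see.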


  17. Spring Cloud Task
    a short-lived microservice framework
    - end-to-end auditing
    - snapshotting and checkpointing for replays
    - pluggable task repository abstraction
    - remote partitioning


  18. Streaming Apps
    Source: file, ftp, gemfire, gemfire-cq, http, jdbc, jms, load-generator, loggregator, mail, mongodb, mqtt, rabbit, s3, sftp, syslog, tcp, tcp-client, time, trigger, triggertask, twitterstream
    Processor: aggregator, bridge, filter, groovy-filter, groovy-transform, header-enricher, httpclient, pmml, python-http, python-jython, scriptable-transform, splitter, tasklaunchrequest-transform, tcp-client, tensorflow, transform, twitter-sentiment
    Sink: aggregate-counter, cassandra, counter, field-value-counter, file, ftp, gemfire, gpfdist, hdfs, hdfs-dataset, jdbc, log, mongodb, mqtt, pgcopy, rabbit, redis-pubsub, router, s3, sftp, task-launcher-cloudfoundry, task-launcher-local, task-launcher-yarn, tcp, throughput, websocket
    Batch/Task Apps
    Task: composed-task-runner, jdbchdfs-local, spark-client, spark-cluster, spark-yarn, timestamp, timestamp-batch


  19. SCDF TensorFlow Processor


  20. Pivotal Data Suite


  21. References
    [1] PMML - Predictive Model Markup Language: https://en.wikipedia.org/wiki/Predictive_Model_Markup_Language
    [2] Spring Cloud Data Flow (SCDF): https://cloud.spring.io/spring-cloud-dataflow/
    [3] Image-Recognition Demo Video: https://www.youtube.com/watch?v=bvDM7_CKQjo&t=38s
    [4] Species Prediction PMML Sample: https://docs.spring.io/spring-cloud-dataflow-samples/docs/current/reference/htmlsingle/#_data_science
    [5] SCDF Twitter Sentiment Analysis (TensorFlow): http://bit.ly/2DHpTfX
    [6] SCDF Object Detection TensorFlow Processor: https://github.com/spring-cloud-stream-app-starters/tensorflow/tree/master/spring-cloud-starter-stream-processor-object-detection
    [7] Object Detection Example: https://www.youtube.com/watch?v=2uOtImHKtgI&t=2s
    [8] Spring Cloud Stream: http://cloud.spring.io/spring-cloud-stream/
    [9] Spring Cloud Task: http://cloud.spring.io/spring-cloud-task/


  22. Keep in touch
    https://github.com/spring-cloud-stream-app-starters/tensorflow
    https://twitter.com/christzolov
    https://www.linkedin.com/in/tzolov
