Machine and Deep Learning with Spring Cloud Data Flow (Spring Meetup NL)

Spring Meetup (NL):

Machine Learning (ML) and Deep Learning (DL) have disrupted the software engineering field. Techniques such as computer vision and language processing have brought unprecedented abilities to everyday software development practitioners. Furthermore, the roles are clearly separated: data scientists can build the ML / DL models, while software developers can apply those pre-trained models in production.

With Spring Cloud Data Flow, you can significantly simplify the task of implementing and operating ML / DL models. In this talk, Christian will demonstrate how to use pre-trained computer vision and Twitter sentiment analysis TensorFlow models in regular Spring Cloud Stream pipelines.
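
As a rough illustration of what "a pre-trained model inside a pipeline" means, the sketch below wires a source, a processor, and a sink together in plain Java. It is only an analogy: it does not use Spring Cloud Stream or TensorFlow, the class and method names (`InferencePipelineSketch`, `classify`, `run`) are hypothetical, and the keyword rule merely stands in for a real pre-trained sentiment model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Plain-Java analogy of a "source | processor | sink" pipeline.
// It does not use Spring Cloud Stream; all names here are hypothetical.
class InferencePipelineSketch {

    // Stand-in for a pre-trained sentiment model embedded in the processor.
    // A real pipeline would load a trained model instead of this keyword rule.
    static String classify(String text) {
        String lower = text.toLowerCase();
        if (lower.contains("love") || lower.contains("great")) {
            return "POSITIVE";
        }
        if (lower.contains("hate") || lower.contains("awful")) {
            return "NEGATIVE";
        }
        return "NEUTRAL";
    }

    // Wire source -> processor -> sink: every payload produced by the source
    // is transformed by the processor and then handed to the sink.
    static <I, O> void run(Supplier<List<I>> source, Function<I, O> processor, Consumer<O> sink) {
        for (I payload : source.get()) {
            sink.accept(processor.apply(payload));
        }
    }

    public static void main(String[] args) {
        List<String> predictions = new ArrayList<>();
        Supplier<List<String>> source =
                () -> List.of("I love Spring", "I hate waiting", "It is Thursday");
        Function<String, String> processor = InferencePipelineSketch::classify;
        Consumer<String> sink = predictions::add;

        run(source, processor, sink);
        System.out.println(predictions); // prints [POSITIVE, NEGATIVE, NEUTRAL]
    }
}
```

The point of the shape is that the model is just one function in the middle of the pipeline, so replacing the keyword rule with a call into a real model would not change the source or the sink.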

Furthermore, we will review several portable ML / DL formats, such as PMML, PFA, MLeap and TensorFlow, and how they relate to Spring Cloud Data Flow.

Christian Tzolov is a Principal Software Engineer at Pivotal, where he works on the Spring Cloud Data Flow team. He is an Apache Committer and Apache Crunch PMC Member. Christian is an OSS advocate, interested in Integration and Interoperability architectures for Distributed and Data-Intensive systems.


Christian Tzolov

March 29, 2018


  1. 1.

    Machine & Deep Learning with Spring Cloud Data Flow

    Christian Tzolov, Pivotal. Engineer, Spring Cloud Data Flow. Apache Committer, Crunch PMC member.
  2. 2.

    Industry Trends

    - Enterprises are adopting DevOps practices in their transition into software and data-driven businesses.
    - ETL integration with existing systems and modernization efforts are still very important.
    - Continuous event processing is becoming mainstream.
    - Integration of IoT data flows and Machine Learning / Deep Learning algorithms.
  3. 3.

    Machine / Deep Learning (ML/DL)

    - Brings unprecedented abilities to the Software Engineering field.
    - Provides a different way to reason about problems.
    - Solves “un-programmable” tasks.
  4. 5.

    Image Recognition TensorFlow Demo

    Spoiler: Spring Cloud Data Flow (SCDF) would tackle the ML integration complexity.
  5. 6.

    The ML Paradigm

    - Observations about an uncertain world
    - Experiments with training datasets
    - Statistics to analyze the results
  6. 7.

    ML/DL Life-cycle

    - Phase 1: Train model on historical datasets
    - Phase 2: Run pre-trained model for predictive analytics
  7. 8.

    For Java practitioners?

    Model inference for predictive analytics is the most common use of ML/DL in Java applications.
  8. 9.

    Inference Considerations

    - ML Model Reusability: PMML, PFA, MLeap, ONNX … TensorFlow
    - Model Serving vs Embedding

    [Diagram 1: Input Data Stream, Java Process, Pre-trained ML Model, Output Predictions Stream]
    [Diagram 2: Input Data Stream, Java Process, External System, Pre-trained ML Model, Output Predictions Stream]
  9. 10.

    Reference Architecture

    - Real-Time ML Inference
    - Embedded Pre-trained Models
    - PMML & TensorFlow ML models

    [Diagram: Input Data Stream, Java Process, Pre-trained ML Model, Output Predictions]
  10. 11.

    Species Prediction

    - Iris Flower Dataset
    - Naive Bayes classifier
    - SCDF Sample: current/reference/htmlsingle/#_data_science
  11. 14.

    Spring Cloud Data Flow: a toolkit for building data integration, real-time, and batch data processing pipelines
  12. 15.

    Spring Cloud Stream: an event-driven microservice framework

    - eliminate boilerplate when developing messaging apps
    - pluggable messaging middleware abstraction
    - durable publish/subscribe semantics
    - data partitioning
    - schema evolution and management
  13. 16.

    Spring Cloud Streams

    - DSL inspired by Unix Pipes & Filters
    - Source | Processor* | Sink
    - Data payload flows through some transport abstraction
    - Example:

        stream create demo --deploy --definition "http | transform --expression=payload.toUpperCase() | file"

      Here http is the Source, transform is the Processor, --expression=payload.toUpperCase() is its option, and file is the Sink.

    [Diagram: Stream A: Source, Processor, Processor, Sink, over Transport Middleware]
  14. 17.

    Spring Cloud Task: a short-lived microservice framework

    - end-to-end auditing
    - snapshotting and checkpointing for replays
    - pluggable task repository abstraction
    - remote partitioning
  15. 18.

    Streaming Apps

    - Source: file, ftp, gemfire, gemfire-cq, http, jdbc, jms, load-generator, loggregator, mail, mongodb, mqtt, rabbit, s3, sftp, syslog, tcp, tcp-client, time, trigger, triggertask, twitterstream
    - Processor: aggregator, bridge, filter, groovy-filter, groovy-transform, header-enricher, httpclient, pmml, python-http, python-jython, scriptable-transform, splitter, tasklaunchrequest-transform, tcp-client, tensorflow, transform, twitter-sentiment
    - Sink: aggregate-counter, cassandra, counter, field-value-counter, file, ftp, gemfire, gpfdist, hdfs, hdfs-dataset, jdbc, log, mongodb, mqtt, pgcopy, rabbit, redis-pubsub, router, s3, sftp, task-launcher-cloudfoundry, task-launcher-local, task-launcher-yarn, tcp, throughput, websocket

    Batch/Task Apps

    - Task: composed-task-runner, jdbchdfs-local, spark-client, spark-cluster, spark-yarn, timestamp, timestamp-batch
  16. 21.

    References

    [1] PMML - Predictive Model Markup Language (Predictive_Model_Markup_Language)
    [2] Spring Cloud Data Flow (SCDF)
    [3] Image-Recognition Demo Video: v=bvDM7_CKQjo&t=38s
    [4] Species Prediction PMML Sample: dataflow-samples/docs/current/reference/htmlsingle/#_data_science
    [5] SCDF Twitter Sentiment Analysis (TensorFlow)
    [6] SCDF Object Detection TensorFlow Processor: cloud-stream-app-starters/tensorflow/tree/master/spring-cloud-starter-stream-processor-object-detection
    [7] Object Detection Example: v=2uOtImHKtgI&t=2s
    [8] Spring Cloud Stream
    [9] Spring Cloud Task
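
The two-phase ML/DL life-cycle from the slides (train a model on historical data, then run the pre-trained model for predictions) can be sketched in a few lines of plain Java. This is a toy illustration under stated assumptions, not the approach used in the talk: the nearest-centroid classifier is swapped in for simplicity, the data points are made up, and the names `TrainThenPredictSketch`, `Example`, `train`, and `predict` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of the two ML phases; not a Spring Cloud Data Flow API.
class TrainThenPredictSketch {

    // A labeled training example: a feature vector plus its class label.
    static final class Example {
        final double[] features;
        final String label;

        Example(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    // Phase 1: "train" by averaging the feature vectors of each label.
    // The resulting map of label -> centroid is the model.
    static Map<String, double[]> train(List<Example> history) {
        Map<String, double[]> centroids = new LinkedHashMap<>();
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Example e : history) {
            double[] sum = centroids.computeIfAbsent(e.label, k -> new double[e.features.length]);
            for (int i = 0; i < sum.length; i++) {
                sum[i] += e.features[i];
            }
            counts.merge(e.label, 1, Integer::sum);
        }
        for (Map.Entry<String, double[]> entry : centroids.entrySet()) {
            int n = counts.get(entry.getKey());
            double[] centroid = entry.getValue();
            for (int i = 0; i < centroid.length; i++) {
                centroid[i] /= n;
            }
        }
        return centroids;
    }

    // Phase 2: run the trained model by picking the label whose centroid
    // is closest (squared Euclidean distance) to the input features.
    static String predict(Map<String, double[]> model, double[] features) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : model.entrySet()) {
            double[] centroid = entry.getValue();
            double distance = 0.0;
            for (int i = 0; i < centroid.length; i++) {
                double d = features[i] - centroid[i];
                distance += d * d;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up "historical" data, for illustration only.
        List<Example> history = List.of(
                new Example(new double[] {1.0, 1.0}, "small"),
                new Example(new double[] {2.0, 1.0}, "small"),
                new Example(new double[] {8.0, 9.0}, "large"),
                new Example(new double[] {9.0, 8.0}, "large"));

        Map<String, double[]> model = train(history);                // Phase 1
        System.out.println(predict(model, new double[] {1.5, 0.5})); // prints small
        System.out.println(predict(model, new double[] {7.0, 7.0})); // prints large
    }
}
```

In practice the two phases usually run in different places: the model returned by `train` would be exported in a portable format, and only the `predict` side would live in the Java application.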