DevNexus 2018: Continuous Delivery for Data Pipelines

Sabby Anandan
February 22, 2018

Abstract: Continuous delivery is central to every software-driven organization. Whether you practice it today or would like to learn it or extend it to “data-centric applications”, this demo-driven talk will supplement your current methods and provide useful context for future developments.

In this talk, we will review Spring Cloud Data Flow and Spring Cloud Skipper and how they come together to solve the data integration and continuous delivery challenges, respectively. We will demonstrate the day-to-day developer workflow, including development, testing, CI, and the overall orchestration on cloud platforms (e.g., Cloud Foundry, Kubernetes).

Transcript

  1. Spring XD, Spring Cloud Stream, Spring Cloud Task, Spring Cloud Skipper, Spring Flo, Spring Cloud Data Flow. Sabby Anandan | @sabbyanandan
  2. Role of Data Integration; Data Pipeline Concepts; CI/CD & Data Pipelines; Orchestrate All-the-Things
  3. 5

  4. Spring Cloud Data Flow: a toolkit for building data integration, real-time streaming, and batch data processing pipelines
  5. Data Integration

    Streaming Apps
    Source: file, ftp, gemfire, gemfire-cq, http, jdbc, jms, load-generator, loggregator, mail, mongodb, mqtt, rabbit, s3, sftp, syslog, tcp, tcp-client, time, trigger, triggertask, twitterstream
    Processor: aggregator, bridge, filter, groovy-filter, groovy-transform, header-enricher, httpclient, pmml, python-http, python-jython, scriptable-transform, splitter, tasklaunchrequest-transform, tcp-client, tensorflow, transform, twitter-sentiment
    Sink: aggregate-counter, cassandra, counter, field-value-counter, file, ftp, gemfire, gpfdist, hdfs, hdfs-dataset, jdbc, log, mongodb, mqtt, pgcopy, rabbit, redis-pubsub, router, s3, sftp, task-launcher-cloudfoundry, task-launcher-local, task-launcher-yarn, tcp, throughput, websocket

    Batch/Task Apps
    Task: composed-task-runner, jdbchdfs-local, spark-client, spark-cluster, spark-yarn, timestamp, timestamp-batch
  6. A DSL inspired by Unix’s pipes-and-filters syntax: “source | processor | … | processor | sink” and “source | processor > :commonDestination”
  7. 14

  8. DEMO: Add prefix to each Payload

    Input: 111-22-3333, 444-55-6666, 777-88-9999, . . .
    Output: The Security Number = 111-22-3333, The Security Number = 444-55-6666, The Security Number = 777-88-9999, . . .
  9. Message Binder Abstraction: flexible messaging-middleware implementations. Same code; same tests; runs exactly the same on different message brokers: Google PubSub, Active MQ, IBM MQ, Solace, Amazon SQS, Amazon Kinesis, Rabbit MQ, Apache Kafka, Kafka Streams
  10. FEEDBACK

    “The DSL doesn’t provide granular control over application lifecycle”
    “Rely on runtime-platform’s blue-green deployment support for rolling-upgrades”
    “Manually tweak and re-deploy application properties by hand”
    “Changing deployment properties means a new stream/task altogether”
  11. DEMO: Mask each Payload

    Input: 111-22-3333, 444-55-6666, 777-88-9999, . . .
    Output: The Security Number = xxx-xx-3333, The Security Number = xxx-xx-6666, The Security Number = xxx-xx-9999, . . .
    Labels: Don’t Disturb; Don’t Disturb; Fix This!
  12. [Architecture diagram. Labels: SCDF Shell; SCDF Server (REST); Skipper Server (REST); Kubernetes / Cloud Foundry; three streams, each drawn from source, process, and sink apps; Binders; Stream App Deploy; Changes Detected; Delta; Diff; Record; Single Source of Truth]
  13. [Pipeline diagram comparing Continuous Delivery (one manual step, the rest automatic) with Continuous Deployment (every step automatic). Stage labels: Build, Test, Package, IT Test, Unit Test, Candidate Stage, E2E Test, Deploy to PROD]
  14. V1, V2, V3, V4, V5, V6: each and every action is versioned … Single Source of Truth
  15. Resources:

    Spring Cloud Data Flow: http://cloud.spring.io/spring-cloud-dataflow
    Spring Cloud Skipper: http://cloud.spring.io/spring-cloud-skipper/
    Concourse: http://concourse.ci/
    Demo: https://github.com/sabbyanandan/xfmr

    Keep Your Customers Happy!
  16. Resources: Bike to Work Day in San Francisco; Unsafe bike lane disclaimer; Aerial Of Dubai Highway Roads 4k
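
The two demo slides (add a prefix to each payload, then mask each payload) describe simple per-message transformations chained in the "processor | processor" style of the DSL. As a rough illustration only, the sketch below models them as plain `java.util.function.Function` values. It is not the code from the talk's demo repository; the class name `PayloadTransforms`, the field names, and the regular expression are assumptions made for this example, and only the input and output strings are taken from the slides.

```java
import java.util.function.Function;

// Hypothetical names throughout; only the sample payloads come from the slides.
class PayloadTransforms {

    // Mask step (assumed regex): hide the first two digit groups, keep the last four digits.
    static final Function<String, String> MASK =
            payload -> payload.replaceAll("\\b\\d{3}-\\d{2}-(\\d{4})\\b", "xxx-xx-$1");

    // Prefix step: prepend the label shown on the demo slide.
    static final Function<String, String> PREFIX =
            payload -> "The Security Number = " + payload;

    // Chain the steps the way "processor | processor" chains apps: mask first, then prefix.
    static final Function<String, String> MASK_THEN_PREFIX = MASK.andThen(PREFIX);

    public static void main(String[] args) {
        String[] payloads = {"111-22-3333", "444-55-6666", "777-88-9999"};
        for (String payload : payloads) {
            System.out.println(MASK_THEN_PREFIX.apply(payload));
        }
    }
}
```

Keeping each step as its own function mirrors the idea on the slides that a single processor can be changed (here, swapping in masking) without disturbing the rest of the pipeline.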