Cloud Native Data Streaming Microservices With Spring Cloud and Kafka

[Talk given at the Kafka Summit NYC, May 8, 2017]

Marius Bogoevici

May 10, 2017

Transcript

  1. Unless otherwise indicated, these slides are © 2013-2017, Pivotal Software, Inc. and licensed under a Creative Commons Attribution-
    NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
    Cloud Native Data Streaming
    Microservices
    with Spring Cloud and Kafka
    Marius Bogoevici, Pivotal
    @mariusbogoevici

  2. Who am I?
    • Software Engineer with Pivotal
    – Project Lead, Spring Cloud Stream
    • Spring ecosystem contributor since 2008:
    – Spring Integration, Spring XD, Spring Integration Kafka,
    – Spring Cloud Stream, Spring Cloud Data Flow
    • Co-author, “Spring Integration in Action”, Manning, 2012

  3. Why microservices for data processing?
    • Cohesiveness around business capability (“do one thing and do it well”)
    • Organizational alignment (Conway’s Law), cross-team collaboration
    • Development agility
    • Optimized for replacement
    • Enable continuous delivery
    • Failure isolation
    • Granular resource tuning:
    – scaling out the critical parts of the pipeline
    – per-process: memory, CPU, network

  5. Event-driven (Messaging) Microservices
    • Solving communication complexity for data processing
    • Decoupling:
    – Physical: discovery
    – Temporal: availability
    • Eventual consistency vs. shared stores/distributed transactions
    – especially over heterogeneous resources
    • Pub-sub makes it easy to add new elements to the topology

  6. Data and Event Streaming: Conceptually Similar
    • Data streaming: ingestion, analytics
    • Event streaming: async interaction, event sourcing

  7. Solution: messaging microservices with Kafka
    [Diagram labels: file, jms, http, Kafka, cassandra]

  8. Solution: messaging microservices with Kafka (and Kafka Streams!)
    [Diagram labels: file, jms, http, Kafka, cassandra, count-words]

  9. How about operational complexity?
    • Distributed systems are inherently complex, operating them even more so
    • Operational prerequisites:
    • Self-servicing and provisioning
    • elastic infrastructure
    • Monitoring
    • Rapid delivery
    • CI/CD, deployment pipeline
    • automation
    • “DevOps culture”
    https://martinfowler.com/bliki/MicroservicePrerequisites.html

  11. Cloud native applications and platforms
    • Target a platform that makes running apps reliable, transparent and boring
    • In-built resource management
    • Memory, CPU, networking
    • Elastic scaling
    • Monitoring and failover
    • Health, logging, metrics
    • Routing and load balancing
    • Rolling upgrades
    Platforms: Apache YARN, Apache Mesos, Kubernetes

  12. Cloud-native event-driven microservices with Kafka
    [Diagram labels: file, jms, http, Kafka, Platform, cassandra, count-words]

  13. The Monolith, the Platform and the Microservice(s)
    [Diagram labels: Spring XD, 2015, Spring Cloud Stream, Spring Cloud Data Flow, Spring Cloud Task]

  14. Simple things should be simple;
    complex things should be possible.
    — Alan Kay

  15. Easy to build

  16. Microservice development challenge: reducing the boilerplate code
    [Diagram labels: Monolith, Microservices (in theory, in practice), Boilerplate, Business code]

  17. Spring Boot
    • Spring Framework for microservices
    • Simplified application structure:
    – Opinionated autoconfiguration based on application dependencies
    – Eliminate boilerplate, focus on business code
    • Externalized configuration
    – Environment, run-time arguments, system properties
    • Immutable artifact
    – Uberjar with nested dependencies
    – Executes from command line
    • Management and monitoring (JMX, HTTP)
    – Actuator endpoints: metrics, health, pause/shutdown
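
The externalized-configuration bullet can be illustrated with a minimal plain-Java sketch. It assumes a simplified lookup order (program arguments first, then system properties, then environment variables) and a simplified environment-variable naming rule; the class and method names are hypothetical, and Spring Boot's actual property resolution covers more sources than shown here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Illustrative only: not Spring Boot code, just the lookup idea behind externalized configuration.
final class ExternalizedConfigSketch {

    /** Parses "--key=value" program arguments into a map; other arguments are ignored. */
    static Map<String, String> parseArgs(String... args) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("--") && eq > 2) {
                parsed.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return parsed;
    }

    /** Returns the first value found: arguments, then system properties, then environment. */
    static Optional<String> resolve(String key,
                                    Map<String, String> args,
                                    Map<String, String> systemProps,
                                    Map<String, String> env) {
        if (args.containsKey(key)) {
            return Optional.of(args.get(key));
        }
        if (systemProps.containsKey(key)) {
            return Optional.of(systemProps.get(key));
        }
        // Assumed naming rule for environment variables: server.port -> SERVER_PORT
        String envKey = key.toUpperCase(Locale.ROOT).replace('.', '_').replace("-", "");
        return Optional.ofNullable(env.get(envKey));
    }

    public static void main(String[] cmdLine) {
        Map<String, String> args = parseArgs(cmdLine);
        Map<String, String> props = new LinkedHashMap<>();
        System.getProperties().forEach((k, v) -> props.put(String.valueOf(k), String.valueOf(v)));
        System.out.println(resolve("server.port", args, props, System.getenv()).orElse("(not set)"));
    }
}
```

The point of the design is that the same immutable artifact can be reconfigured per environment without a rebuild.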

  18. Spring Initializr: start.spring.io

  19. Trend of projects generated with Spring Initializr

  20. Spring Cloud Stream
    • Event-driven microservice framework
    • Goal: simplify writing messaging applications, bias towards event/data streaming
    • Built on battle-tested components
    – Spring Boot: full-stack standalone applications
    – Spring Integration: Messaging, EIP patterns, connectors
    • Opinionated primitives provided as configuration options:
    – Durable publish-subscribe
    – Consumer groups
    – Semantic partitioning
    • Pluggable middleware abstractions
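
The interplay of publish-subscribe and consumer groups can be sketched in plain Java: every subscribed group receives each message, while within a group only one instance handles it. This is a toy in-memory model under simplifying assumptions (round-robin delivery inside a group, hypothetical class names). It is not how Kafka or a binder distributes messages internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model of pub-sub with consumer groups; all names are hypothetical.
final class ConsumerGroupSketch {

    /** One consumer group: instances compete for messages (round-robin in this sketch). */
    static final class Group {
        final List<List<String>> instances = new ArrayList<>();
        private int next = 0;

        Group(int instanceCount) {
            for (int i = 0; i < instanceCount; i++) {
                instances.add(new ArrayList<>());
            }
        }

        void deliver(String message) {
            instances.get(next).add(message);
            next = (next + 1) % instances.size();
        }
    }

    /** A destination that fans every published message out to all subscribed groups. */
    static final class Destination {
        private final Map<String, Group> groups = new LinkedHashMap<>();

        Group subscribe(String groupName, int instanceCount) {
            return groups.computeIfAbsent(groupName, name -> new Group(instanceCount));
        }

        void publish(String message) {
            for (Group group : groups.values()) {
                group.deliver(message);
            }
        }
    }

    public static void main(String[] args) {
        Destination topic = new Destination();
        Group hdfs = topic.subscribe("hdfs", 2);
        Group audit = topic.subscribe("audit", 1);
        for (String m : new String[] {"a", "b", "c", "d"}) {
            topic.publish(m);
        }
        System.out.println("hdfs instances: " + hdfs.instances);
        System.out.println("audit instances: " + audit.instances);
    }
}
```

Adding a new group to the destination leaves existing consumers untouched, which is what makes it easy to extend a topology.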

  21. Spring Cloud Stream in a nutshell

  22. Binder abstraction

  25. … 10000 ft nutshell

  26. Programming model
    @EnableBinding + Binder implementation
    • Production-ready: Apache Kafka
    • Experimental: JMS, Google PubSub

  27. Programming model: individual message handling
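    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.cloud.stream.messaging.Processor;
    import org.springframework.messaging.handler.annotation.SendTo;

    // Binds the "input" and "output" channels declared by Processor
    @SpringBootApplication
    @EnableBinding(Processor.class)
    public class UppercaseProcessor {

        // Handles each payload arriving on "input" and sends the result to "output"
        @StreamListener("input")
        @SendTo("output")
        public String process(String s) {
            return s.toUpperCase();
        }
    }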

  28. Spring Cloud Stream + Kafka Streams
    • Overlapping paradigm: encapsulated input/output
    • Spring Cloud Stream + KStream
    – subscribes input KStreams to topics
    – connects the output KStream to topics
    • OOTB stateful processing support with KStream
    • Spring Cloud Stream content type negotiation
    – Or use Confluent Schema Registry directly
    • Underlying Spring Boot support
    – Flexible configuration: program arguments, environment variables
    – Actuator endpoints: health, metrics

  29. Programming model: Kafka Streams
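    // The generic type parameters below are an assumption; the input key is unused.
    @StreamListener("input")
    @SendTo("output")
    public KStream<String, Long> wordCount(KStream<?, String> input) {
        // Re-key each record by the upper-cased word, then count records per key
        return input.map((key, word) -> new KeyValue<>(word.toUpperCase(), 1))
                .groupByKey()
                .count("Counts")
                .toStream();
    }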
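
To make the semantics of the word-count topology concrete, here is a rough plain-Java analogue over a finite list. It deliberately does not use Kafka Streams: the real topology runs continuously over an unbounded stream and keeps its counts in a state store, whereas this sketch (with a hypothetical class name) only mirrors the "re-key by upper-cased word, group, count" steps.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Batch analogue of the streaming word count; not Kafka Streams code.
final class WordCountSketch {

    /** Upper-cases each word, groups equal words together and counts each group. */
    static Map<String, Long> wordCount(List<String> words) {
        return words.stream()
                .map(word -> word.toUpperCase(Locale.ROOT))
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("kafka", "Kafka", "spring")));
    }
}
```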


  30. Spring Cloud Stream + Kafka Streams
    [Diagram: Spring Cloud Stream KStream Processor with input "words" and output "word-counts"]
    • Developer focus:
    – KStream API: programming model
    • Framework focus:
    – Spring Cloud Stream: application model (configuration options, StreamsConfig based on Spring Boot properties, KStreamBuilder, KStream, lifecycle)
    – Spring Boot: externalized configuration, uberjar construction, health monitoring endpoints

  31. Programming model: Kafka Streams (2)
    @StreamListener
    @SendTo("clicksImpressions")
    public KStream join(
            @Input("clicks") KStream clicks,
            @Input("impressions") KStream impressions) {
        // join clicks and impressions
    }

    Functional programming model with
    multiple inputs and outputs

  32. Easy to orchestrate and deploy

  33. Simple topologies: (relatively) easy to deploy …
    [Diagram labels: http, httphdfs.1, HDFS]
    spring.cloud.stream.bindings.output.destination=httphdfs
    spring.cloud.stream.bindings.input.destination=httphdfs
    spring.cloud.stream.bindings.input.group=httphdfs

  34. … but how about complex topologies?
    [Diagram labels: http, raw-sensor-data, averages, top-n, Calculator, Failure detector, HDFS]

  35. Spring Cloud Data Flow
    • Orchestration:
    – DSL for Stream topologies
    – REST API
    – Shell
    – UI
    • Portable Deployment SPI
    • OOTB apps for common integration use-cases

  36. Spring Cloud Data Flow - Stream DSL
    Stream definition: httpfile = http | file
    http, file: Spring Boot apps built with Spring Cloud Stream
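
As a rough illustration of what an orchestrator has to do with such a definition, the sketch below splits a pipe-delimited definition into app names and derives binding properties so that each app's output feeds the next app's input. The class, the assumption of unique app names and the `<stream>.<app>` destination naming are hypothetical choices for this example, not Spring Cloud Data Flow's documented behavior. Only the `spring.cloud.stream.bindings.*` property keys are taken from the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turns "http | work | hdfs" into per-app binding properties.
final class StreamDefinitionSketch {

    /** Splits a pipe-delimited definition into app names, rejecting empty segments. */
    static List<String> apps(String definition) {
        List<String> apps = new ArrayList<>();
        for (String part : definition.split("\\|", -1)) {
            String name = part.trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("empty app name in: " + definition);
            }
            apps.add(name);
        }
        return apps;
    }

    /** Wires each app's output to the next app's input (assumes unique app names). */
    static Map<String, Map<String, String>> bindings(String streamName, String definition) {
        List<String> apps = apps(definition);
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (String app : apps) {
            result.put(app, new LinkedHashMap<>());
        }
        for (int i = 0; i + 1 < apps.size(); i++) {
            // Assumed naming scheme: one destination per pipe, named <stream>.<producer app>
            String destination = streamName + "." + apps.get(i);
            result.get(apps.get(i)).put("spring.cloud.stream.bindings.output.destination", destination);
            Map<String, String> consumer = result.get(apps.get(i + 1));
            consumer.put("spring.cloud.stream.bindings.input.destination", destination);
            // All instances of the consumer share one group, so each message is handled once
            consumer.put("spring.cloud.stream.bindings.input.group", streamName);
        }
        return result;
    }

    public static void main(String[] args) {
        bindings("httpfile", "http | file").forEach((app, props) -> System.out.println(app + " -> " + props));
    }
}
```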

  37. Spring Cloud Data Flow Deployment Platforms
    [Diagram labels: SCDF Flo, SCDF Shell, REST API, Data Flow Server, Deployer SPI]

  38. [Diagram labels: Data Flow Server, DB, Platform, Kafka, http, jms, file, cassandra, gpfdist, count-words]
    stream1 = http | count-words | file
    stream2 = jms | cassandra

  39. Deployment: Partitioning and Instance Count
    [Diagram: Load Balancer, 2 × http, 3 × work, 4 × hdfs]
    stream create s1 --definition "http | work | hdfs"
    stream deploy s1 --propertiesFile ingest.properties
    app.http.count=2
    app.work.count=3
    app.hdfs.count=4
    app.http.producer.partitionKeyExpression=payload.id
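
The effect of a partition key such as `payload.id` can be sketched as follows: the key is hashed and reduced modulo the partition count, so records with the same key always land on the same partition and therefore on the same downstream instance. This is a minimal sketch with a hypothetical class name. The framework's actual selection strategy is configurable and may differ in detail.

```java
// Illustrative hash-modulo partition selection; names are hypothetical.
final class PartitionSketch {

    /** Maps a partition key to an index in [0, partitionCount). */
    static int selectPartition(Object key, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int workInstances = 3; // matches app.work.count=3 above
        for (Object id : new Object[] {42, 43, 42, "sensor-17"}) {
            System.out.println(id + " -> partition " + selectPartition(id, workInstances));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, per-key ordering and stateful processing keep working as instances are scaled out, provided the partition count stays fixed.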

  40. Deployment: Resource Management
    [Diagram: 2 × http, 3 × work]
    app.work.spring.cloud.deployer.cloudfoundry.memory=2048

  41. Conclusions
    • Spring Cloud Stream, Spring Cloud Data Flow and Kafka are complementary
    • Kafka provides:
    – High-throughput, low-latency messaging middleware (transport)
    – Stream processing engine via Kafka Streams
    • Spring Cloud Stream provides:
    – Spring Boot integration
    – Boilerplate reduction via opinionated application model
    • Spring Cloud Data Flow provides:
    – High-level orchestration for sophisticated topologies
    – Simplified deployment on a number of platforms: Cloud Foundry, Kubernetes, Mesos, YARN

  42. Some links …
    http://cloud.spring.io/spring-cloud-stream
    http://cloud.spring.io/spring-cloud-dataflow
    https://github.com/mbogoevici/spring-cloud-stream-binder-kstream
    https://github.com/spring-cloud/spring-cloud-stream-samples

  43. Questions?
