Making Stream Processing the Right Way in the Cloud using Apache Kafka

Nov 13th, 2018: Apache Kafka® Meetup in Raleigh, NC

Ricardo Ferreira

November 13, 2018

Transcript

  1. 1
    Making Stream Processing
    the Right Way in the Cloud
    using Apache Kafka®
    Raleigh Apache Kafka Meetup
    Nov 13th, 2018 | Ricardo Ferreira
    @riferrei

  2. 2
    I want to start off by
    telling you a story…

  3. 3
    The story
    about Bob…
    Bob Sheppard,
    20 years.
    …as a
    Software
    Engineer.

  4. 4
    …and how his story
    will help us with Kafka
    in the Cloud.
    Bob Sheppard,
    24 years

  5. 5
    In the following events – Bob
    will be depicted in the story as
    a regular Software Engineer.

  6. 6
    One day during a sprint planning…
    Product Owner: "Guys, I need an
    estimate of how long it would take
    to put Apache Kafka in the Cloud"
    Software Engineer: "Sure Mr.
    Product Owner. Give me a day so I
    can come up with something solid
    to include in your plan."
    Product Owner: "And be as
    accurate as possible because
    DEV needs this for coding."
    "Hahaha"

  7. 7
    Workflow of the Software Engineer evaluating the complexity…
    Hello World with
    Apache Kafka on his
    laptop. Easy peasy!
    Handling data with
    producers and
    consumers. Done!
    Enough of this
    child's play: loading
    data from PROD
    Updating my
    LinkedIn profile with
    a new Skill…
    Sweet: Confluent has
    several Docker
    Images pre-built.
    Build a Docker image
    from the Hello World
    and run in the Cloud

  8. 8
    …and the next day…

  9. 9
    During the next day's meeting…
    Software Engineer: "Mr. Product
    Owner, I can easily put Apache Kafka
    in the Cloud in a couple of weeks…"
    Product Owner: "That is fantastic.
    Gonna write this down here with
    blood and fire, plus recording all…"
    "Hahaha"

  10. 10
    …and throughout the two weeks, there were lots of interesting findings…
    Zookeeper ports
    cannot be exposed?
    Keeping data secure
    while in-transit?
    Keeping data
    secure at-rest?
    Plan IP addresses ahead
    to avoid overlapping?
    Enabling remote
    SSH access?
    Developer authentication should
    use BASIC AUTH or OAuth 2.0?
    Partitions auto-
    rebalancing?
    How do I prevent people from
    changing the infrastructure?
    Who should own the
    infrastructure?
    How do I manage the upgrades of
    the Apache Kafka binaries?
    SOC-2 Type II
    What if I have a hybrid
    center of gravity?
    Avoid vendor lock-in or
    sell my soul to XYZ?
    Scale in or scale out?
    And without downtime?
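
One of the findings above, planning IP addresses ahead to avoid overlapping, can be checked mechanically before any network is created. The following is a minimal sketch using only the Python standard library; the network names and CIDR ranges are hypothetical placeholders, not values from the talk.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of named networks whose address ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan: names and ranges are made up for illustration.
PLANNED = {
    "tools-vpc": "10.0.0.0/16",
    "on-prem": "10.0.128.0/17",
    "other-vpc": "172.16.0.0/16",
}

if __name__ == "__main__":
    for name_a, name_b in find_overlaps(PLANNED):
        print(f"{name_a} overlaps {name_b}")
```

Running a check like this during planning is cheaper than discovering the conflict when two networks need to be peered.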

  11. 11
    … we all know how this story ends…

  12. 12
    During Friday around 5:30 PM…

  13. 13
    If you didn't get the joke,
    please grab your phone
    and buy this movie now.
    As an IT professional,
    you owe yourself this.

  14. 14
    Bob's story teaches us three important lessons:
    1. You don't want to build infrastructure. You want to build software.
    2. Distributed Systems are hard to manage. And Cloud makes it even harder.
    3. Do you really know who Bob was in this story? No? Then check it out…

  15. 15
    About me:
    • Developer Advocate @ Confluent
    • Having Fun with Coding since 1997
    • Ex-Oracle, Red Hat, IONA Technologies
    • Blog: https://riferrei.net
    • Twitter: @riferrei
    o Geek stuff and Apache Kafka®
    o DC/Marvel, Mindhunter Book
    • Brazilian, Husband and Father

  16. 16
    Options for Apache Kafka in the Cloud
    1. Running and Managing by Yourself
    2. Running with Docker and Kubernetes
    3. Infrastructure as Code, Ansible, Chef
    4. Apache Kafka as a Service

  17. 17
    Option 1: Running and Managing by Yourself
    • You need to build pretty much everything,
    except of course for Apache Kafka itself.
    • Pros:
    o Ability to own and control everything.
    o Lots of learning sources out there.
    • Cons:
    o Extremely complex to manage and scale.
    o High TCO, including infrastructure costs.

  18. 18
    Option 2: Running with Docker and Kubernetes
    • Makes it transparent where Apache Kafka is
    running, whether on-premises or in the Cloud.
    • Pros:
    o Abstracts cluster and deployment details.
    o Tooling support: Confluent Operator, Pivotal
    PKS, Red Hat OpenShift, Cloud Extensions.
    • Cons:
    o Underlying infrastructure is still necessary.
    o There are people and infrastructure costs.

  20. 20
    Option 3: Infrastructure as Code, Ansible, Chef
    • Makes it transparent where Apache Kafka is
    running, as well as all the infrastructure details.
    • Pros:
    o Abstracts cluster and deployment details.
    o Creates an immutable, repeatable, disposable
    infrastructure that can be managed as code.
    • Cons:
    o Vendor lock-in, provisioning complexity.
    o There are people and infrastructure costs.

  21. 21
    Option 4: Apache Kafka as a Service
    • Apache Kafka is delivered to you as SaaS,
    leaving more time for coding and innovation.
    • Pros:
    o SLAs about performance and availability.
    o Pay-as-you-Go, predictable costs up front.
    o Multiple Cloud vendor choices = No lock-in.
    • Cons:
    o You don't have control over anything.
    o Something new to learn and be good at.

  22. 22
    Apache Kafka as a Service = Confluent Cloud™
    • Meet Confluent Cloud: a scalable streaming
    data service based on Apache Kafka that is
    delivered to you 100% as a service.
    • Supported by the best Kafka engineers.
    • Best hybrid center of gravity support.
    • Multi-Cloud support: AWS, GCP, Others.
    • Apache Kafka APIs = No proprietary stuff.
    • Access to the Confluent Platform Tools
    • Professional and Enterprise Plans.
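
Because the service speaks the standard Apache Kafka APIs, an ordinary Kafka client only needs connection and security settings to reach it. The sketch below builds such settings as a plain dictionary, using configuration keys from librdkafka-based clients (for example the confluent-kafka Python package, which is an assumption here and is not imported). The helper name, endpoint, and credentials are hypothetical placeholders.

```python
from typing import Dict


def confluent_cloud_client_config(
    bootstrap_servers: str, api_key: str, api_secret: str
) -> Dict[str, str]:
    """Build client settings for a TLS-encrypted, SASL/PLAIN-authenticated cluster.

    The keys follow librdkafka naming; Java clients use slightly different
    names (for example sasl.mechanism and sasl.jaas.config).
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",  # encrypts data in transit
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,
        "sasl.password": api_secret,
    }


if __name__ == "__main__":
    # Placeholder values: substitute the endpoint and credentials of a real cluster.
    config = confluent_cloud_client_config(
        "<BROKER_ENDPOINT>:9092", "<API_KEY>", "<API_SECRET>"
    )
    # With confluent-kafka installed, this dictionary could be handed to a client:
    #   from confluent_kafka import Producer
    #   producer = Producer(config)
    print(sorted(config))
```

The same dictionary would work against a self-managed cluster configured with the same security settings, which is the practical meaning of "no proprietary stuff".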

  23. 23
    Demo

  24. 24
    Confluent Cloud Tools Project
    • How about using the Confluent Platform
    ecosystem in a snap with your cluster?
    • Say hello to Confluent Cloud Tools:
    • https://github.com/confluentinc/ccloud-tools
    • Creates a highly available, Multi-AZ set of
    tools, fully secured within a VPC, that is
    automatically connected to your Confluent
    Cloud cluster.

  25. 25
    Confluent Cloud Tools Project
    [Diagram: Confluent Cloud Tools deployment]
    VPC to VPC connectivity: Public Internet OR VPC Peering
    Availability Zone 1, one private subnet each: Schema Registry 1, REST Proxy 1, KSQL Server 1
    Availability Zone 2, one private subnet each: Schema Registry 2, REST Proxy 2, KSQL Server 2
    Also shown: Public Subnet, Bastion Server
    Multi-AZ for HA; Schema Registry, REST Proxy and KSQL Server scale out across AZs

  26. 26
    Demo

  27. 27
    "The Cube"
    Demo

  28. 29
    Confluent Cloud "The Cube" Demo
    Stream of Events (over time):
    Name   Motion (X)  Motion (Y)  Motion (Z)
    Alice  -45         0           12
    Bob    1           -185        -90
    John   0           -1          90
    Steve  -180        0           180

    Numbers Table:
    Number  X     Y    Z
    1       1     0    0
    2       1     -90  1
    3       -180  0    180
    4       1     90   -1

    Getting the Number 3
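
The matching idea on this slide can be sketched in a few lines of plain Python: each motion event is compared with the X/Y/Z values stored for the target number, and the names of matching players are emitted. This is only an illustration that assumes an exact match; the actual demo runs as KSQL queries and its matching rule may differ. The names `MotionEvent` and `select_winners` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple

Rotation = Tuple[int, int, int]


class MotionEvent(NamedTuple):
    name: str
    x: int
    y: int
    z: int


# Values copied from the slide.
NUMBERS: Dict[int, Rotation] = {
    1: (1, 0, 0),
    2: (1, -90, 1),
    3: (-180, 0, 180),
    4: (1, 90, -1),
}

EVENTS = [
    MotionEvent("Alice", -45, 0, 12),
    MotionEvent("Bob", 1, -185, -90),
    MotionEvent("John", 0, -1, 90),
    MotionEvent("Steve", -180, 0, 180),
]


def select_winners(
    events: Iterable[MotionEvent], numbers: Dict[int, Rotation], target: int
) -> Iterator[str]:
    """Yield the name of every player whose motion equals the target number's rotation."""
    wanted = numbers[target]
    for event in events:
        if (event.x, event.y, event.z) == wanted:
            yield event.name


if __name__ == "__main__":
    for name in select_winners(EVENTS, NUMBERS, target=3):
        print("AND THE WINNER IS ----------> " + name)
```

With the slide's data and number 3 as the target, only Steve's event matches.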

  29. 30
    Confluent Cloud "The Cube" Demo
    KSQL CLI:
    SELECT CONCAT('AND THE WINNER IS ----------> ', NAME) AS MESSAGE
    FROM SELECTED_WINNERS;

  30. 31
    Confluent Cloud "The Cube" Demo
    https://bit.ly/2zCjHG8

  31. 32
    Thank You!
    @riferrei
