Making Stream Processing the Right Way in the Cloud using Apache Kafka

Nov 13th, 2018: Apache Kafka® Meetup in Raleigh, NC

Ricardo Ferreira

November 13, 2018

Transcript

  1. 1
    Making Stream Processing
    the Right Way in the Cloud
    using Apache Kafka®
    Raleigh Apache Kafka Meetup
    Nov 13th, 2018 | Ricardo Ferreira
    @riferrei

  2. 2
    I want to start off by
    telling you a story…

  3. 3
    The story
    about Bob…
    Bob Sheppard,
    20 years.
    …as a
    Software
    Engineer.

  4. 4
    …and how his story
    will help us with Kafka
    in the Cloud.
    Bob Sheppard,
    24 years

  5. 5
    In the following events – Bob
    will be depicted in the story as
    a regular Software Engineer.

  6. 6
    One day during a sprint planning…
    Product Owner: "Guys, I need an estimate of how long it would take to put Apache Kafka in the Cloud."
    Software Engineer: "Sure, Mr. Product Owner. Give me a day so I can come up with something solid to include in your plan."
    Product Owner: "And be as accurate as possible, because DEV needs this for coding."
    "Hahaha"

  7. 7
    Workflow of the Software Engineer evaluating the complexity…
    • Hello World with Apache Kafka on his laptop. Easy peasy!
    • Handling data with producers and consumers. Done!
    • Enough of this kid stuff: loading data from PROD.
    • Updating my LinkedIn profile with a new Skill…
    • Sweet: Confluent has several pre-built Docker images.
    • Build a Docker image from the Hello World and run it in the Cloud.

  8. 8
    …and on the next day…

  9. 9
    During the next day's meeting…
    Software Engineer: "Mr. Product Owner, I can easily put Apache Kafka in the Cloud in a couple of weeks…"
    Product Owner: "That is fantastic. Gonna write this down here with blood and fire, plus recording all…"
    "Hahaha"

  10. 10
    …and throughout the two weeks, there were lots of interesting findings…
    • ZooKeeper ports cannot be exposed?
    • Keeping data secure while in transit?
    • Keeping data secure at rest?
    • Plan IP addresses ahead to avoid overlapping?
    • Enabling remote SSH access?
    • Should developer authentication use BASIC AUTH or OAuth 2.0?
    • Partitions auto-rebalancing?
    • How do I prevent people from changing the infrastructure?
    • Who should own the infrastructure?
    • How do I manage upgrades of the Apache Kafka binaries?
    • SOC-2 Type II
    • What if I have a hybrid center of gravity?
    • Avoid vendor lock-in or sell my soul to XYZ?
    • Scale in or scale out? And without downtime?

  11. 11
    … we all know how this story ends…

  12. 12
    On Friday around 5:30 PM…

  13. 13
    If you didn't get the joke,
    please grab your phone
    and buy this movie now.
    As an IT professional,
    you owe yourself this.

  14. 14
    Bob's story teaches us three important lessons:
    1. You don't want to build infrastructure. You want to build software.
    2. Distributed Systems are hard to manage. And Cloud makes it even harder.
    3. Do you really know who Bob was in this story? No? Then check it out…

  15. 15
    About me:
    • Developer Advocate @ Confluent
    • Having Fun with Coding since 1997
    • Ex-Oracle, Red Hat, IONA Technologies
    • Blog: https://riferrei.net
    • Twitter: @riferrei
    o Geek stuff and Apache Kafka®
    o DC/Marvel, Mindhunter Book
    • Brazilian, Husband and Father

  16. 16
    Options for Apache Kafka in the Cloud
    1. Running and Managing by Yourself
    2. Running with Docker and Kubernetes
    3. Infrastructure as Code, Ansible, Chef
    4. Apache Kafka as a Service

  17. 17
    Option 1: Running and Managing by Yourself
    • You need to build pretty much everything, except of course Apache Kafka itself.
    • Pros:
    o Ability to own and control everything.
    o Lots of learning sources out there.
    • Cons:
    o Extremely complex to manage and scale.
    o High TCO and infrastructure costs.

  18. 18
    Option 2: Running with Docker and Kubernetes
    • Makes it transparent where Apache Kafka is running, whether on-premises or in the Cloud.
    • Pros:
    o Abstracts cluster and deployment details.
    o Tooling support: Confluent Operator, Pivotal PKS, Red Hat OpenShift, Cloud Extensions.
    • Cons:
    o Underlying infrastructure is still necessary.
    o There are people and infrastructure costs.

  20. 20
    Option 3: Infrastructure as Code, Ansible, Chef
    • Makes it transparent where Apache Kafka is running, as well as all the infrastructure details.
    • Pros:
    o Abstracts cluster and deployment details.
    o Creates an immutable, repeatable, disposable infrastructure that can be managed as code.
    • Cons:
    o Vendor lock-in, provisioning complexity.
    o There are people and infrastructure costs.

  21. 21
    Option 4: Apache Kafka as a Service
    • Apache Kafka is delivered to you as SaaS, leaving more time for coding and innovation.
    • Pros:
    o SLAs about performance and availability.
    o Pay-as-you-go, with costs predictable up front.
    o Multiple Cloud vendor choices = No lock-in.
    • Cons:
    o You don't have control over anything.
    o Something new to learn and become good at.

  22. 22
    Apache Kafka as a Service = Confluent Cloud™
    • Meet Confluent Cloud: a scalable streaming
    data service based on Apache Kafka that is
    delivered to you 100% as a service.
    • Supported by the best Kafka engineers.
    • Best hybrid center of gravity support.
    • Multi-Cloud support: AWS, GCP, Others.
    • Apache Kafka APIs = No proprietary stuff.
    • Access to the Confluent Platform Tools
    • Professional and Enterprise Plans.
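
Because Confluent Cloud exposes the standard Apache Kafka APIs, an ordinary Kafka client should be able to connect with configuration alone. The sketch below builds such a configuration as a plain Python dictionary. It assumes the cluster is reached over SASL_SSL with the PLAIN mechanism, using an API key and secret. The endpoint, key, and secret are placeholders, and `build_client_config` is a hypothetical helper, not part of any Confluent tool. The key names follow the librdkafka conventions used by the confluent-kafka Python client.

```python
def build_client_config(bootstrap_servers: str, api_key: str, api_secret: str) -> dict:
    """Return a Kafka client configuration for a SASL_SSL-secured cluster.

    Hypothetical helper for illustration; the values passed in are placeholders.
    """
    if not bootstrap_servers:
        raise ValueError("bootstrap_servers must not be empty")
    return {
        "bootstrap.servers": bootstrap_servers,
        # Encrypt traffic in transit and authenticate with SASL.
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        # Assumption: the API key and secret act as username and password.
        "sasl.username": api_key,
        "sasl.password": api_secret,
    }


config = build_client_config(
    "broker.example.com:9092",  # placeholder endpoint, not a real cluster
    "MY_API_KEY",
    "MY_API_SECRET",
)
```

With the confluent-kafka package installed, a dictionary like this could be passed to `confluent_kafka.Producer(config)`; no proprietary client is involved.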

  23. 24
    Confluent Cloud Tools Project
    • How about using the Confluent Platform
    ecosystem in a snap with your cluster?
    • Say hello to Confluent Cloud Tools:
    • https://github.com/confluentinc/ccloud-tools
    • Creates a highly available, Multi-AZ set of tools, fully secured within a VPC, that is automatically connected to your Confluent Cloud cluster.

  24. 25
    Confluent Cloud Tools Project
    [Architecture diagram]
    • VPC: Multi-AZ for HA, scaling out across AZs.
    • Availability Zone 1, one private subnet each: Schema Registry 1, REST Proxy 1, KSQL Server 1.
    • Availability Zone 2, one private subnet each: Schema Registry 2, REST Proxy 2, KSQL Server 2.
    • Also shown: Public Subnet, Bastion Server.
    • Connectivity: Public Internet or VPC Peering.
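
The VPC Peering option in the diagram ties back to the earlier question about planning IP addresses to avoid overlapping. On AWS, for example, a peering connection cannot be created between VPCs whose CIDR blocks overlap. A small check like the following, using Python's standard `ipaddress` module, can catch such a conflict before provisioning. The VPC names, the CIDR ranges, and the `find_overlaps` helper are illustrative assumptions, not values from the deck.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs: dict) -> list:
    """Return pairs of network names (sorted by name) whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical address plan for three VPCs.
planned = {
    "tools-vpc": "10.0.0.0/16",
    "apps-vpc": "10.0.128.0/17",  # sits inside tools-vpc, so it conflicts
    "data-vpc": "10.1.0.0/16",
}
conflicts = find_overlaps(planned)
print(conflicts)
```

An empty result would mean the planned ranges can coexist in a peered setup; any pair returned needs a new range before the networks are created.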

  25. 27
    "The Cube"
    Demo

  26. 29
    Confluent Cloud "The Cube" Demo
    Stream of Events (over time):
    Name   Motion (X)  Motion (Y)  Motion (Z)
    Alice  -45         0           12
    Bob    1           -185        -90
    John   0           -1          90
    Steve  -180        0           180

    Numbers Table:
    Number  X     Y    Z
    1       1     0    0
    2       1     -90  1
    3       -180  0    180
    4       1     90   -1

    Getting the Number 3

  27. 30
    Confluent Cloud "The Cube" Demo
    KSQL CLI:
    SELECT CONCAT('AND THE WINNER IS ----------> ', NAME) AS MESSAGE
    FROM SELECTED_WINNERS;
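
The demo's tables and the query above suggest a match between the stream of motion events and the numbers table, followed by a formatting step. The Python sketch below mimics that flow using the values from the slides. It assumes a winner is an event whose X, Y, and Z exactly equal the target number's row. The deck does not state the actual matching rule or how `SELECTED_WINNERS` is defined, so `selected_winners` here is a hypothetical stand-in.

```python
from typing import NamedTuple


class Motion(NamedTuple):
    x: int
    y: int
    z: int


# Numbers table from the slide: number -> target orientation.
NUMBERS = {
    1: Motion(1, 0, 0),
    2: Motion(1, -90, 1),
    3: Motion(-180, 0, 180),
    4: Motion(1, 90, -1),
}

# Stream of events from the slide, in arrival order.
EVENTS = [
    ("Alice", Motion(-45, 0, 12)),
    ("Bob", Motion(1, -185, -90)),
    ("John", Motion(0, -1, 90)),
    ("Steve", Motion(-180, 0, 180)),
]


def selected_winners(events, numbers, target):
    """Yield names whose motion equals the target number's orientation.

    Assumption: exact equality on all three axes counts as a match.
    """
    wanted = numbers[target]
    for name, motion in events:
        if motion == wanted:
            yield name


winners = list(selected_winners(EVENTS, NUMBERS, target=3))
# Same string concatenation as the CONCAT call in the KSQL query.
messages = ["AND THE WINNER IS ----------> " + name for name in winners]
print(messages)
```

In a real deployment the match would run continuously over the event stream rather than over a fixed list, and a tolerance around each axis value may be more practical than exact equality.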

  28. 31
    Confluent Cloud "The Cube" Demo
    https://bit.ly/2zCjHG8

  29. 32
    Thank You!
    @riferrei
