Let's write a production-ready Kafka Streams app before the end of this talk!

While it's easy to get started with Kafka Streams, building a streaming application with the minimal features required for going into production is usually another story! If you plan to build a complete event-driven architecture based on many Kafka Streams microservices, you will have to know how to handle processing failures and bad records, how to query Kafka Streams state stores, how to monitor and operate instances... Yes, that's starting to be a lot of things, isn't it? And sooner or later, you will probably build and maintain in-house libraries to standardize all that stuff across your projects.

In this talk, I propose to show you how to easily build a Kafka Streams application for production in just a few minutes. But before that, we’ll explore some common practices used to develop Kafka Streams applications. We'll review the things you have to be careful about while developing. Then, I will introduce Azkarra Streams, an open-source lightweight Java framework that lets you focus on writing the topology code that matters for your business, not the boilerplate code for running it!

Florian Hussonnois

February 06, 2020

Transcript

  1. Let's write a production-ready Kafka Streams app before the end of this talk!
    Florian Hussonnois,
    Co-founder, Data Streaming Engineer @StreamThoughts
    @fhussonnois

  2. “Streaming technologies, the final frontier.
    These are the voyages of the Kafka Streams
    Users. Their 45-minute mission: to explore
    strange new words, to seek out new pitfalls
    and new practices, to boldly go where no
    developer has gone before.”

  3. About me
    Co-founder, Data Streaming Engineer @StreamThoughts
    Organizer, Paris Apache Kafka Meetup
    Confluent Community Catalyst
    Apache Kafka Streams contributor
    Open Source Technology Enthusiast
    @fhussonnois

  4. Scottify: The Starfleet media services provider
    What genre of music does each member listen to?
    [Diagram elements: User Transponder App (The Federation); light speed streaming platform with cross-universe topics db-users, db-albums, events-user-activity; our Kafka Streams App (KafkaStreams, REST API), a stateful application]
    © streamthoughts. All rights reserved.
    Not to be reproduced in any form without prior written consent.

  5. Consume & Filter User Events (step 1)
    // Create StreamsBuilder.
    StreamsBuilder builder = new StreamsBuilder();
    // Consume all user events from input topic event-user-activity.
    KStream<String, UserEvent> allUserEvents = builder
        .stream(
            "event-user-activity",
            consumedEarliestWithValueSerde(newJsonSerde(UserEvent.class))
        );
    // Filter records to only keep events of type MUSIC_LISTEN_START.
    KStream<String, SongListenedEvent> songListenedEvents = allUserEvents
        .filter((userId, event) -> event.isOfType(UserEventType.MUSIC_LISTEN_START))
        .mapValues(SongListenedEvent::parseEventPayload);
    Input Record
    key=031,
    value={"data":"Damage, Inc; Master of Puppets","userId":"031",
    "type":"MUSIC_LISTEN_START"}

  6. Enrich song listened events (step 2)
    // Previous code is omitted for clarity
    KStream<String, SongListenedEvent> songListenedEvents = ...
    // Create all GlobalKTables for albums and users.
    GlobalKTable<String, Album> albums = createGlobalKTable(
        builder, "db-albums", "Albums", Album.class);
    GlobalKTable<String, User> users = createGlobalKTable(
        builder, "db-users", "Users", User.class);
    // Join events with Albums and Users global state stores
    KGroupedStream<String, Tuple<User, Album>> groupedStreams =
        songListenedEvents
            .leftJoin(users,
                (userId, event) -> userId, /* keyValueMapper */
                Tuple::of) /* ValueJoiner */
            .leftJoin(albums,
                (userId, tuple) -> tuple.left().album, /* keyValueMapper */
                (tuple, album) -> Tuple.of(tuple.right(), album)) /* ValueJoiner */
            .groupByKey();

  7. Aggregate songs listened by genre (step 3)
    // Previous code is omitted for clarity
    KGroupedStream<String, Tuple<User, Album>> groupedStreams = ...
    // Do aggregate
    KTable<String, UserListenCountPerGenre> kTable = groupedStreams
        .aggregate(
            UserListenCountPerGenre::new, /* initializer */
            /* aggregator */
            (userId, tuple, aggregate) -> aggregate.update(tuple.left(), tuple.right()),
            /* materialized store (explicit type arguments are needed for withValueSerde to compile) */
            Materialized.<String, UserListenCountPerGenre, KeyValueStore<Bytes, byte[]>>as("UserSongsListenedByGenre")
                .withValueSerde(newJsonSerde(UserListenCountPerGenre.class)));
    // Stream results to the sink topic
    kTable.toStream()
        .to(
            "agg-listened-genres-by-user",
            Produced.with(Serdes.String(), newJsonSerde(UserListenCountPerGenre.class))
        );

  8. Running Streams Topology (step 4)
    Topology topology = builder.build();
    Properties streamsProps = new Properties();
    streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app-id");
    streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    streamsProps.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
        Serdes.StringSerde.class);
    streamsProps.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
        Serdes.StringSerde.class);
    KafkaStreams kafkaStreams = new KafkaStreams(topology, streamsProps);
    kafkaStreams.start();
    Runtime.getRuntime().addShutdownHook(new Thread(kafkaStreams::close));
    Output Record
    key=031,
    value={ "userName": "James Tiberius Kirk", "listenedPerGenre": {
    "Alternative hip hop": 49, "Rock": 119, "Metal": 34} }

  9. Writing a Kafka Streams application is usually not so hard (about as hard as your business logic).
    But writing an app for production can be MORE COMPLEX

  10. Some principles to design a Kafka Streams application for production
    ● Handle streams exceptions
    ● Monitor Kafka Streams states & topic-partition assignments
    ● Handle state stores lifecycle and access
    ● Write streams code to be testable and reusable
    ● Externalize streams configuration
    ● Properly manage the initialization of a Kafka Streams instance

  11. KafkaStreams.start()
    “Live long, and prosper process” - Spock
    11

  12. KafkaStreams.start(): Deep dive (step 0)
    ❏ Before starting, KafkaStreams creates and configures all internal threads.
    Configuration: num.stream.threads=2
    GlobalStreamThread: state=CREATED
    StreamThread-0, StreamThread-1: state=CREATED
    KafkaStreams.state() = CREATED
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  13. KafkaStreams.start(): Deep dive (step 1)
    ❏ KafkaStreams will start the GlobalStreamThread.
    GlobalStreamThread: start(), state=CREATED; components: GlobalStateMaintainer, global-consumer, StateRestoreCallback, GlobalStore
    StreamThread-0, StreamThread-1: state=CREATED
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  14. KafkaStreams.start(): Deep dive (step 1)
    ❏ Once all global state stores are restored, the consumer is assigned to all partitions of all source topics.
    GlobalStreamThread: start(), state=RUNNING; components: GlobalStateMaintainer, GlobalStore, global-consumer, Processor
    StreamThread-0, StreamThread-1: state=CREATED
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  15. KafkaStreams.start(): Deep dive (step 2)
    ❏ KafkaStreams starts all StreamThreads.
    ❏ Each consumer subscribes to the source topics and starts polling.
    GlobalStreamThread: state=RUNNING; components: GlobalStateMaintainer, Store, global-consumer, Processor
    StreamThread-0, StreamThread-1: start(), state=PARTITIONS_REVOKED; each has a consumer (subscribe) in the consumer-group
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  16. KafkaStreams.start(): Deep dive (step 2)
    ❏ As part of the rebalancing protocol, each StreamThread is assigned Tasks (i.e., topic-partitions).
    GlobalStreamThread: state=RUNNING; components: GlobalStateMaintainer, Store, global-consumer, Processor
    StreamThread-0, StreamThread-1: state=PARTITIONS_ASSIGNED; each has a consumer (assigned)
    Tasks: Task 0_0 and Task 0_1, each with a Store
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  17. KafkaStreams.start(): Deep dive (step 3)
    ❏ Each Task is restored.
    ❏ One consumer per thread is dedicated to restoring state stores.
    GlobalStreamThread: state=RUNNING; components: GlobalStateMaintainer, Store, global-consumer, Processor
    StreamThread-0, StreamThread-1: state=PARTITIONS_ASSIGNED; each has a consumer (assigned)
    Tasks: Task 0_0 and Task 0_1, each with a Store
    State Store Recovering: each thread has a restore-consumer and a StateRestoreCallback, reading changelog-store-p0 and changelog-store-p1
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  18. KafkaStreams.start(): Deep dive (step 4)
    ❏ StreamThreads are running once all Tasks (active/standby) are restored.
    GlobalStreamThread: state=RUNNING; components: GlobalStateMaintainer, GlobalStore, global-consumer, Processor
    StreamThread-0, StreamThread-1: state=RUNNING; each has a consumer (assigned)
    Tasks: Task 0_0 and Task 0_1, each with a Store
    KafkaStreams.state() = REBALANCING
    Topics: db-albums, db-users, event-user-activity (P0, P1)

  19. KafkaStreams.state() == State.RUNNING
    (i.e., our streams application is now up and running)
    19
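The state walk-through above lends itself to a simple health rule. The following is only a sketch under stated assumptions: `StreamsState` is a local enum that mirrors the state names of `KafkaStreams.State`, so the example runs without a Kafka dependency, and the `Health` labels and the `StreamsHealth` class are hypothetical names, not part of Kafka Streams.

```java
/** Local stand-in mirroring the state names of KafkaStreams.State (assumption: no Kafka dependency here). */
enum StreamsState { CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, ERROR }

/** Hypothetical health labels that a health-check endpoint could expose. */
enum Health { UP, STARTING, DOWN }

final class StreamsHealth {
    private StreamsHealth() { }

    /** Maps a streams state to a coarse health label. */
    static Health of(StreamsState state) {
        switch (state) {
            case RUNNING:
                // All StreamThreads are running and local stores are restored.
                return Health.UP;
            case CREATED:
            case REBALANCING:
                // Threads are starting, or tasks are being assigned/restored.
                return Health.STARTING;
            default:
                // PENDING_SHUTDOWN, NOT_RUNNING, ERROR.
                return Health.DOWN;
        }
    }
}
```

With the real API, the input could come from `streams.state().name()` converted with `StreamsState.valueOf(...)`, assuming the names match.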

  20. Issue #1: Messages are NOT processed while state stores are recovering
    ❏ A StreamTask can’t start processing messages until all its state stores are fully recovered.
    ❏ This guarantees the consistency of returned data.
    Availability vs Consistency (state-data)
    [Diagram elements: Task 0 and its State; source topic P0 (last committed offset, current position); changelog topic P0 (internal restore consumer position). A kv.get( ) returning 1 after kv.put( , 2 ): this can’t happen!]

  21. Issue #1: Messages are NOT processed while state stores are recovering
    ❏ A StreamTask can’t start processing messages until all its state stores are fully recovered.
    ❏ This guarantees the consistency of returned data.
    Availability vs Consistency (state-data)
    [Diagram elements: Task 0 and its State; source topic P0 (last committed offset); changelog topic P0 (internal restore consumer position). A kv.get( ) returning 1 after kv.put( , 2 ): this can’t happen!]
    As a result, a long recovery process can significantly increase consumer lag.

  22. Monitoring State Store Recovering: Using a Global listener
    // Listen to all state stores restoration
    StateRestoreListener listener = new StateRestoreListener() {
        @Override
        public void onRestoreStart(
            TopicPartition topicPartition, String storeName, long startingOffset, long endingOffset) {
        }
        @Override
        public void onBatchRestored(
            TopicPartition topicPartition, String storeName, long batchEndOffset, long numRestored) {
        }
        @Override
        public void onRestoreEnd(
            TopicPartition topicPartition, String storeName, long totalRestored) {
        }
    };
    // Must be registered before streams.start()
    streams.setGlobalStateRestoreListener(listener);
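The empty callbacks above could feed a small progress tracker. This is a minimal sketch, not code from the talk: `RestoreProgressTracker` and its method names are hypothetical, it uses plain JDK types (store name and partition number instead of `TopicPartition`), and the percentage is only approximate because offsets are not record counts.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper that a StateRestoreListener could delegate to. */
final class RestoreProgressTracker {

    /** Offsets observed for one changelog partition being restored. */
    private static final class Progress {
        final long startingOffset;
        final long endingOffset;
        volatile long restoredOffset;

        Progress(long startingOffset, long endingOffset) {
            this.startingOffset = startingOffset;
            this.endingOffset = endingOffset;
            this.restoredOffset = startingOffset;
        }
    }

    private final Map<String, Progress> inProgress = new ConcurrentHashMap<>();

    private static String key(String storeName, int partition) {
        return storeName + "-" + partition;
    }

    /** Intended to be called from onRestoreStart. */
    void restoreStarted(String storeName, int partition, long startingOffset, long endingOffset) {
        inProgress.put(key(storeName, partition), new Progress(startingOffset, endingOffset));
    }

    /** Intended to be called from onBatchRestored. */
    void batchRestored(String storeName, int partition, long batchEndOffset) {
        Progress progress = inProgress.get(key(storeName, partition));
        if (progress != null) {
            progress.restoredOffset = batchEndOffset;
        }
    }

    /** Intended to be called from onRestoreEnd. */
    void restoreEnded(String storeName, int partition) {
        inProgress.remove(key(storeName, partition));
    }

    /** True while at least one store partition is still being restored. */
    boolean isRestoring() {
        return !inProgress.isEmpty();
    }

    /** Approximate completion in [0, 100]; 100 when nothing is being restored for this store partition. */
    double percentRestored(String storeName, int partition) {
        Progress progress = inProgress.get(key(storeName, partition));
        if (progress == null) {
            return 100.0;
        }
        long total = progress.endingOffset - progress.startingOffset;
        if (total <= 0) {
            return 100.0;
        }
        long done = Math.min(Math.max(progress.restoredOffset - progress.startingOffset, 0L), total);
        return 100.0 * done / total;
    }
}
```

Exposing `isRestoring()` and `percentRestored(...)` through a metrics or health endpoint is one way to make long recoveries visible to operators.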

  23. Monitoring State Store Recovering: Using Consumer Metrics
    Consumer metrics can help to monitor recovery performance:
    MBean: kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)
    The average number of:
    ❏ bytes consumed per second for a topic: bytes-consumed-rate
    ❏ bytes fetched per request for a topic: fetch-size-avg
    ❏ records in each request for a topic: records-per-request-avg
    ❏ records consumed per second for a topic: records-consumed-rate

  24. Issue #2: Things may be blocking
    StreamsBuilder builder = new StreamsBuilder();
    //...
    builder.globalTable("input-topic");
    Topology topology = builder.build();
    new KafkaStreams(topology, streamsConfig).start(); // This can actually block!
    Depending on your code, this can block your entire application, and perhaps lead to an application crash (or timeout).
    https://issues.apache.org/jira/browse/KAFKA-7380

  25. Issue #2: ...a better way
    final KafkaStreams streams = new KafkaStreams(topology, streamsConfig);
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        final Thread thread = new Thread(r, "streams-starter");
        // Set as a user (non-daemon) thread, since internal StreamThreads will inherit from this one.
        thread.setDaemon(false);
        return thread;
    });
    CompletableFuture.supplyAsync(() -> {
        streams.start();
        return streams.state();
    }, executor);

  26. How to turn our application into a distributed queryable KV store?
    “Change is the essential process of all existence.
    — Spock”
    26

  27. Interactive Queries: In 30 seconds
    ❏ IQ allows direct access to local states
    ❏ Read-only access
    ❏ Simplifies architecture by removing external DB needs
    [Diagram elements: UI, REST, User Service API, KafkaStreams API; StreamThread-0 and StreamThread-1, each with a consumer; Task 0_0 and Task 0_1, each with a Store]

  28. Interactive Queries: In 30 seconds
    ❏ Simple API that works for DSL and Processor API
    ReadOnlyKeyValueStore<String, Album> store = streams.store(
        "Albums",
        QueryableStoreTypes.keyValueStore()
    );
    Album value = store.get("Ok Computer");
    [Diagram elements: UI, User Service API, KafkaStreams API; StreamThread-0 and StreamThread-1, each with a consumer; Task 0_0 and Task 0_1, each with a Store]

  29. Issue #1: State stores can’t be queried while recovering
    ❏ Developers have to check for state-store availability
    ⚠ Please don’t use an infinite loop to wait for a state store to be ready.
    BAD
    streams.start();
    ReadOnlyKeyValueStore<String, Album> store;
    while (true) {
        try {
            store = streams.store(
                "Albums",
                QueryableStoreTypes.keyValueStore()
            );
            break;
        } catch (InvalidStateStoreException ignored) {
            // wait...store not ready yet
            Time.SYSTEM.sleep(1000L);
        }
    }
    // do something useful with the store…
    Caveats:
    ❏ StreamThread could be DEAD
    ❏ Bad store name is provided
    ❏ Useless for GlobalStateStore
    ❏ State may have migrated

  30. Issue #1: State stores can’t be queried while recovering
    ❏ Developers have to check for state-store availability
    ⚠ Please don’t use an infinite loop to wait for a state store to be ready.
    OK
    streams.start();
    // Global state store can be queried from here
    if (streams.state().isRunning() /* RUNNING or REBALANCING */) {
        ReadOnlyKeyValueStore<String, Album> store =
            streams.store(
                "Albums",
                QueryableStoreTypes.keyValueStore()
            );
        // do something useful with the store...
    }
    // Wait for all StreamThreads to be ready
    while (streams.state() != KafkaStreams.State.RUNNING) {
        Time.SYSTEM.sleep(1000L);
    }
    // all local state stores are now recovered
    Caveats:
    ❏ StreamThread could be DEAD
    ❏ Bad store name is provided
    ❏ Useless for GlobalStateStore
    ❏ State may have migrated

  31. Issue #1: State stores can’t be queried while recovering
    ❏ Developers have to check for state-store availability
    ❏ Please don’t use an infinite loop to wait for a state store to be ready.
    OK
    streams.start();
    // Global state store can be queried from here
    if (streams.state().isRunning() /* RUNNING or REBALANCING */) {
        ReadOnlyKeyValueStore store =
            streams.store(
                "globalStoreName",
                QueryableStoreTypes.keyValueStore()
            );
        // do something useful with the store...
    }
    // Wait for all StreamThreads to be ready
    while (streams.state() != KafkaStreams.State.RUNNING) {
        Time.SYSTEM.sleep(1000L);
    }
    // all local state stores are now recovered
    Caveats:
    ❏ StreamThread could be DEAD
    ❏ Bad store name is provided
    ❏ Useless for GlobalStateStore
    This only works for a single streams instance!
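One way to avoid an unbounded wait is to poll with a deadline and stop early on a failed state. The sketch below is an illustration only: `StateAwaiter` is a hypothetical helper, and the state is read through a generic `Supplier` so that the example runs without a Kafka dependency.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper: waits for a state with a timeout instead of looping forever. */
final class StateAwaiter {
    private StateAwaiter() { }

    /**
     * Polls currentState until it equals expected.
     *
     * @return true if the expected state was reached; false on timeout,
     *         on a failed state, or if the waiting thread was interrupted.
     */
    static <S> boolean awaitState(Supplier<S> currentState,
                                  S expected,
                                  Predicate<S> isFailed,
                                  Duration timeout,
                                  Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            S state = currentState.get();
            if (expected.equals(state)) {
                return true;
            }
            if (isFailed.test(state)) {
                return false; // e.g. the instance is in error or shut down
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(Math.min(pollInterval.toMillis(), remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the real API this could be called as `awaitState(streams::state, KafkaStreams.State.RUNNING, s -> s == KafkaStreams.State.ERROR || s == KafkaStreams.State.NOT_RUNNING, Duration.ofSeconds(30), Duration.ofMillis(500))`; the timeout values are arbitrary.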

  32. Scaling our Application: Distributed States
    ❏ Tasks are spread across instances/threads
    ❏ Each instance owns a subset of the state stores
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 (source topic P0), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (source topic P1), Store
    K       V
    Spock   Rock
    Picard  Electro

  33. Issue #2: Discovery API
    As developers you will have to:
    ❏ Query local instance for metadata.
    ❏ Discover which instance has the data you are looking for.
    ❏ Query either local or remote state store.
    Caveats:
    A lot of boilerplate code
    But bad things may happen...
    // Discover which instance hosts the key-value
    StreamsMetadata metadata = streams.metadataForKey(
        "UserSongsListenedByGenre",
        "Picard",
        Serdes.String().serializer()
    );
    // Check if key-value is hosted by the local instance
    if (isLocalHost(metadata.hostInfo())) {
        ReadOnlyKeyValueStore<String, UserListenCountPerGenre> store =
            streams.store(
                "UserSongsListenedByGenre",
                QueryableStoreTypes.keyValueStore()
            );
        store.get("Picard");
    } else {
        forwardInteractiveQuery(metadata.hostInfo());
    }
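The slide does not show `isLocalHost`. One possible shape for it, as a hedged sketch: compare the host and port from the metadata with the local `application.server` value. `LocalHostCheck` is a hypothetical name, and the method takes plain strings so the example runs without a Kafka dependency.

```java
/** Hypothetical helper: decides whether a discovered host is the local instance. */
final class LocalHostCheck {
    private LocalHostCheck() { }

    /**
     * @param applicationServer the local "host:port" value, as set in application.server
     * @param host              host reported by the metadata lookup
     * @param port              port reported by the metadata lookup
     */
    static boolean isLocalHost(String applicationServer, String host, int port) {
        int separator = applicationServer.lastIndexOf(':');
        if (separator < 0) {
            return false; // not in host:port form
        }
        String localHost = applicationServer.substring(0, separator);
        int localPort;
        try {
            localPort = Integer.parseInt(applicationServer.substring(separator + 1));
        } catch (NumberFormatException e) {
            return false;
        }
        return localHost.equalsIgnoreCase(host) && localPort == port;
    }
}
```

With the real API, the last two arguments would typically come from `metadata.hostInfo().host()` and `metadata.hostInfo().port()`.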

  34. Issue #2: Scaling Up or Down
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 and Task 0_1 (source topic P0, P1), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Spock   Rock
    Picard  Electro
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, STARTING

  35. Issue #2: Scaling Up or Down
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 and Task 0_1 (source topic P0, P1), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Spock   Rock
    Picard  Electro
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (NEW), Store, changelog topic
    K       V
    Spock   Rock
    Task Migration: Recovering...

  36. Issue #2: Scaling Up or Down
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 and Task 0_1 (source topic P0, P1), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Spock   Rock
    Picard  Electro
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (NEW), Store, changelog topic
    K       V
    Spock   Rock
    Task Migration: Recovering...
    Code will throw an InvalidStateStoreException (usually a transient failure)
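Since the failure is usually transient, a query can be retried a bounded number of times with a backoff. The following sketch is an assumption-laden illustration: `QueryRetrier` is a hypothetical helper, and the transient exception type is passed in as a parameter (with Kafka Streams it would be `InvalidStateStoreException`) so the example runs on the JDK alone.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical helper: retries a query when a transient error is thrown. */
final class QueryRetrier {
    private QueryRetrier() { }

    /**
     * Runs the query, retrying up to maxRetries times when transientError is thrown.
     * Any other exception is rethrown immediately; the last transient error is
     * rethrown once the retries are exhausted or the thread is interrupted.
     */
    static <T> T queryWithRetries(Supplier<T> query,
                                  Class<? extends RuntimeException> transientError,
                                  int maxRetries,
                                  Duration backoff) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return query.get();
            } catch (RuntimeException e) {
                if (!transientError.isInstance(e)) {
                    throw e; // not a transient failure: do not retry
                }
                last = e;
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(backoff.toMillis());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap the store lookup and the `get` in the supplier, so that a store that migrated between attempts is looked up again on each retry.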

  37. Issue #3: Instance Failure
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 (source topic P0), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (source topic P1), Store, changelog topic
    K       V
    Spock   Rock
    Picard  Electro
    Failure: crash / network outage

  38. Issue #3: Instance Failure
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 (source topic P0), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (source topic P1), Store, changelog topic
    K       V
    Spock   Rock
    Picard  Electro
    session.timeout.ms (default 10 seconds)

  39. Issue #3: Instance Failure
    Instance 1 (JVM), application.server=localhost:8080: StreamThread-0, consumer, Task 0_0 (source topic P0), Store
    K       V
    Kirk    Metal
    Saru    Classical
    Instance 2 (JVM), application.server=localhost:8082: StreamThread-0, consumer, Task 0_1 (source topic P1), Store, changelog topic
    K       V
    Spock   Rock
    Picard  Electro
    java.net.ConnectException (while state is not re-assigned)

  40. Your app must handle state store unavailability, not the client querying it!

  41. Handling Exceptions
    “It is possible to commit no mistakes and still
    lose. That is not weakness, that is life.” –
    Jean-Luc Picard
    41

  42. Unexpected Messages, or how to lose your app...
    [Diagram elements: User Transponder App (The Federation); the bad guys (Klingon Empire); topics db_users, db_albums, event_user_activity carrying corrupted records (!); our application (KafkaStreams, REST API); DeserializationException]

  43. Solution #1: Skip or Fail (built-in mechanisms)
    default.deserialization.exception.handler
    ❏ CONTINUE: continue with processing
    ❏ FAIL: fail the processing and stop
    Two available implementations:
    ❏ LogAndContinueExceptionHandler
    ❏ LogAndFailExceptionHandler
    Not really suitable for production: corrupted messages cannot be monitored efficiently.
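For reference, selecting one of the two built-in handlers is a single configuration entry. This is a minimal sketch using plain `java.util.Properties` and string literals: the config key is the one named on the slide and the class is the built-in handler from the `org.apache.kafka.streams.errors` package, while `DeserializationHandlerConfig` is a hypothetical helper name.

```java
import java.util.Properties;

/** Hypothetical helper: adds the built-in "log and continue" handler to a streams configuration. */
final class DeserializationHandlerConfig {
    private DeserializationHandlerConfig() { }

    static Properties withLogAndContinue(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        // Config key as named on the slide; the value is the fully qualified built-in handler class.
        props.put("default.deserialization.exception.handler",
                  "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler");
        return props;
    }
}
```

The returned properties would then be passed to the `KafkaStreams` constructor together with the topology.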

  44. Solution #2: Dead Letter Topic (DeserializationExceptionHandler)
    Send corrupted messages to a special topic.
    [Diagram elements: Source Topic, Handler, Topology (skip), Dead Letter topic]
    Solution #3: Sentinel Value (Deserializer)
    Catch any exception thrown during deserialization and return a default value (e.g., null, “N/A”).
    [Diagram elements: Source Topic, SafeDeserializer wrapping an Inner Deserializer, (null) values passed to the Topology]
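The sentinel-value idea can be sketched with a small wrapper. This is an illustration under assumptions, not the talk's implementation: `Function<byte[], T>` stands in for Kafka's `Deserializer<T>` so that the example runs on the JDK alone, and the class below only borrows the `SafeDeserializer` name from the slide.

```java
import java.util.function.Function;

/**
 * Sketch of a "safe" deserializer: delegates to an inner deserializer and
 * returns a sentinel value instead of propagating a deserialization failure.
 * Function<byte[], T> is a stand-in for Kafka's Deserializer<T> (assumption).
 */
final class SafeDeserializer<T> implements Function<byte[], T> {
    private final Function<byte[], T> inner;
    private final T sentinel;

    SafeDeserializer(Function<byte[], T> inner, T sentinel) {
        this.inner = inner;
        this.sentinel = sentinel;
    }

    @Override
    public T apply(byte[] data) {
        try {
            return inner.apply(data);
        } catch (RuntimeException e) {
            // Corrupted record: hand the sentinel to the topology instead of failing.
            return sentinel;
        }
    }
}
```

The topology then has to recognize the sentinel, for example by filtering it out as its first step.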

  45. Best Practices: How to send corrupted messages
    ❏ Never change the schema/format of the corrupted message.
    ❏ Send the original message as-is to the DLT.
    ❏ Use Kafka Headers to trace exception cause and origin.
    Kafka Message
    Key: raw key
    Value: raw value
    Headers: original topic / partition / offset, exception trace, app info (id, host, version)
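The header layout above could be built as follows. This is a sketch with invented names: the `dlt.*` header keys and the `DeadLetterHeaders` class are hypothetical, and a plain `Map<String, byte[]>` stands in for Kafka's record headers so that the example has no Kafka dependency. The raw key and value are deliberately not touched.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds tracing headers for a record sent to a dead letter topic. */
final class DeadLetterHeaders {
    private DeadLetterHeaders() { }

    static Map<String, byte[]> build(String topic,
                                     int partition,
                                     long offset,
                                     Throwable cause,
                                     String applicationId) {
        Map<String, byte[]> headers = new LinkedHashMap<>();
        // Origin of the corrupted record (header names are illustrative).
        headers.put("dlt.original.topic", utf8(topic));
        headers.put("dlt.original.partition", utf8(Integer.toString(partition)));
        headers.put("dlt.original.offset", utf8(Long.toString(offset)));
        // Cause of the failure.
        headers.put("dlt.exception.class", utf8(cause.getClass().getName()));
        headers.put("dlt.exception.message", utf8(String.valueOf(cause.getMessage())));
        // Which application rejected the record.
        headers.put("dlt.application.id", utf8(applicationId));
        return headers;
    }

    private static byte[] utf8(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }
}
```

Keeping the payload untouched and moving all diagnostics into headers means the dead letter topic can be replayed into the original topic once the producer-side problem is fixed.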

  46. Ok, Let's summarize!
    46

  47. Business Value vs Effort
    [Chart: Kafka Streams Application. Axes: Business Value (Low to High), Effort (Low/Medium to High). Items: Topology Definition; Kafka Streams Management: IQ, Error Handling logic, Monitoring / Health-check, Security, Configuration Externalization, Streams Lifecycle]

  48. Business Value vs Effort
    [Chart: Kafka Streams Application. Axes: Business Value (Low to High), Effort (Low/Medium to High). Items: Topology Definition; Kafka Streams Management: IQ, Error Handling logic, Monitoring / Health-check, Security, Configuration Externalization, Streams Lifecycle]
    Sooner or later, you'll write your own Kafka Streams framework to wrap all that stuff!

  49. We never build a single Kafka Streams app, but dozens, each running across multiple instances.
    Consistency
    Without common practices and libraries, teams have to (re)write new classes for handling errors, Kafka Streams startup, Interactive Queries, etc., for each project.
    Updatability
    Kafka Streams is evolving fast, with new features, bug fixes and optimizations. Keeping multiple applications up to date with the latest version can be challenging.
    Operability
    Kafka Streams can be challenging for operations teams. Making it easy to monitor streams instances using a standard API, across all projects, will help to keep the system running smoothly.

  50. (Azkarra is the Basque word for "Fast")

  51. Yet Another Micro-Framework
    A lightweight framework that makes it easy to create production-ready Kafka Streams applications.
    ❏ Open source since November 2019 under the Apache-2.0 License.
    ❏ Written in Java.
    Add a star to the GitHub project, it only takes 5 seconds ^^

  52. . Key Features
    Developer Friendly
    Production-Ready
    Secured
    Easy to learn API
    Configuration Externalization
    REST APIs for Interactive Queries
    SSL/TLS, Basic Authentication
    Built-in Healthchecks
    REST Endpoints for Metrics (JSON / Prometheus)
    Error handling logic
    52

  53. How to use it?
    <dependency>
        <groupId>io.streamthoughts</groupId>
        <artifactId>azkarra-streams</artifactId>
        <version>0.6.1</version>
    </dependency>


  54. . Concept
    TopologyProvider
    54
    54
    public interface TopologyProvider
    extends Provider {
    /**
    * Supplies a new Kafka Streams {@link Topology}
    * instance.
    *
    * @return the {@link Topology} instance.
    */
    Topology get();
    /**
    * Returns the version of the supplied
    * {@link Topology}.
    *
    * @return the string version.
    */
    @Override
    String version();
    }
    A simple interface to implement
    ❏ Each topology must be versioned.
    ❏ Used to provide the Topology instance.
    1

  55. . Concept
    StreamsExecutionEnvironment
    55
    // Configure environment specific properties
    Conf config = …
    // Create a new environment to manage lifecycle of one or many KafkaStreams instances.
    StreamsExecutionEnvironment env = DefaultStreamsExecutionEnvironment.create("dev-env")
        .setConfiguration(config)
        .setKafkaStreamsFactory(/**/)
        .setRocksDBConfig(RocksDBConfig.withStatsEnable())
        .setApplicationIdBuilder(/**/)
        .addGlobalStateListener(/**/)
        .addStateListener(/**/);
    // Register the topology
    env.addTopology(CountUserListenMusicPerGenreTopology::new, Executed.as("scottify-streams"));
    2

  56. . Concept
    StreamsLifecycleInterceptor
    56
    public interface StreamsLifecycleInterceptor {
    /**
    * This method is executed before starting the streams instance.
    */
    default void onStart(StreamsLifecycleContext context,
    StreamsLifecycleChain chain) {
    chain.execute();
    }
    /**
    * This method is executed before stopping the streams instance.
    */
    default void onStop(StreamsLifecycleContext context,
    StreamsLifecycleChain chain) {
    chain.execute();
    }
    }
    AutoCreateTopicsInterceptor
    WaitForSourceTopicsInterceptor
    Built-in interceptors
    3

  57. Concept: AzkarraContext
    // Configure context specific properties
    Conf ctxConf = …
    AzkarraContext context = DefaultAzkarraContext.create(ctxConf)
    Registering a custom ExecutionEnvironment
    context.addExecutionEnvironment(env)
    Using a default ExecutionEnvironment
    context.addTopology(
        CountUserListenMusicPerGenreTopology.class,
        "dev-env",
        Executed.as("scottify-streams")
    )
    4

  58. Concept: AzkarraApplication
    // Provide configuration externalization (using the Lightbend Config library).
    Conf appConf = AzkarraConf.create("application");
    new AzkarraApplication()
        .setConfiguration(appConf)
        .setContext(context)
        .setBannerMode(Banner.Mode.OFF)
        .enableHttpServer(true)
        .setRegisterShutdownHook(true)
        .run(args);
    Provides additional features on top of AzkarraContext:
    ❏ Embedded HTTP Server
    ❏ Component Scan (scans the current package for classes annotated with @Component, e.g., TopologyProvider)
    ❏ Application auto-configuration
    5

  59. . Interactive Queries
    HTTP Endpoint
    59
    Request
    $ POST /api/v1/applications/:application_id/stores/:store_name
    {
    "set_options": {
    "query_timeout_ms": 1000,
    "retries": 100,
    "retry_backoff_ms": 100,
    "remote_access_allowed": true
    },
    "type": "key_value",
    "query": {
    "get": {
    "key": "001"
    }
    }
    }
    59
    Response
    {
    "took": 1,
    "timeout": false,
    "server": "localhost:8080",
    "status": "SUCCESS",
    "total": 1,
    "result": {
    "success": [
    {
    "server": "localhost:8080",
    "remote": false,
    "records": [
    {
    "key": "001",
    "value": {
    "name": "James Tiberius Kirk",
    "gender": "Male",
    "species": "Human",
    "key": "001"
    }
    }
    ]
    ...
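A client could build the request above with the JDK's `java.net.http` API (Java 11+). This is only a sketch: the path and the JSON fields come from the slide, while the base URL, the `Content-Type` header, the `InteractiveQueryRequests` class, and the assumption that the key needs no JSON escaping are mine.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side helper for the key-value "get" query shown above. */
final class InteractiveQueryRequests {
    private InteractiveQueryRequests() { }

    static HttpRequest keyValueGet(String baseUrl, String applicationId, String storeName, String key) {
        // Assumption: the key contains no characters that need JSON escaping.
        String body = "{\"type\":\"key_value\",\"query\":{\"get\":{\"key\":\"" + key + "\"}}}";
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/v1/applications/" + applicationId + "/stores/" + storeName))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.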

  60. DEMO

  61. . Azkarra WebUI
    61

  62. . Azkarra WebUI
    62

  63. . Azkarra WebUI
    63

  64. Additional Resources
    Source Code
    ❏ https://github.com/streamthoughts/demo-kafka-streams-scottify
    Azkarra Streams
    ❏ https://streamthoughts.github.io/azkarra-streams/
    ❏ https://medium.com/streamthoughts/introducing-azkarra-streams-the-first-micro-framework-for-apache-kafka-streams-e13605f3a3a6
    ❏ https://dev.to/fhussonnois/create-kafka-streams-applications-faster-than-ever-before-via-azkarra-streams-3nng

  65. . Images & Icons
    Images
    Photo by Bryan Goff on Unsplash
    Photo by Stefan Cosma on Unsplash
    Photo by Jordan Whitfield on Unsplash
    Icons
    https://en.wikipedia.org/wiki/Starfleet
    https://en.wikipedia.org/wiki/Klingon
    65

  66. Thank you
    @streamthoughts
    www.streamthoughts.io