Data Analytics with Amazon Kinesis

Suman Debnath

January 28, 2020

Transcript

  1. © 2020, Amazon Web Services, Inc. or its Affiliates.
    Suman Debnath
    Principal Developer Advocate
    Amazon Web Services
    Data Analytics with Amazon Kinesis

  7. What is streaming data?
    Typical characteristics:
    • Low-latency
    • Continuous
    • Ordered, incremental
    • High volume

  8. Why streaming data?
    Get actionable insights quickly
    [Figure: Information half-life in decision-making. Vertical axis: value of data to decision-making. Horizontal axis: real time, seconds, minutes, hours, days, months. Stages along the curve: preventive/predictive, actionable, reactive, historical, ranging from time-critical decisions to traditional “batch” business intelligence.]
    Source: Perishable insights, Mike Gualtieri, Forrester

  9. real-time

  11. Amazon Kinesis
    • Kinesis is a managed alternative to Apache Kafka
    • Great for application logs, metrics, IoT, clickstreams
    • Great for “real-time” big data
    • Great for stream processing frameworks (Spark, NiFi, etc.)
    • Data is automatically replicated synchronously across 3 AZs
    Services:
    • Amazon Kinesis Data Streams
    • Amazon Kinesis Data Firehose
    • Amazon Kinesis Data Analytics
    • Amazon Kinesis Video Streams

  12. Amazon Kinesis
    [Diagram: Amazon Kinesis Data Streams, Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, Amazon S3, Amazon Redshift, Amazon Elasticsearch Service]

  13. Kinesis Streams Overview
    • Streams are divided into ordered shards (partitions)
    • Data retention is 24 hours by default and can be extended up to 7 days
    • Ability to reprocess / replay data
    • Multiple applications can consume the same stream
    • Once data is inserted into Kinesis, it can’t be deleted (immutability)
    [Diagram: Producer → Shard 1, Shard 2, … Shard n → Consumer]
    • Producer side: up to 1 MB or 1,000 records per second, per shard
    • Consumer side: up to 2 MB per second, per shard
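
The per-shard limits on this slide translate directly into a sizing rule: a stream needs enough shards to stay under each limit at once. The sketch below is only an illustration of that arithmetic. The function name `shards_needed` is hypothetical, 1 MB is treated as 1,000,000 bytes, and it assumes all consumers share the per-shard read limit, which the slide does not state.

```python
import math

# Per-shard limits quoted on the slide (1 MB taken as 10**6 bytes, an assumption).
WRITE_BYTES_PER_SEC = 1_000_000   # producer side: 1 MB per second
WRITE_RECORDS_PER_SEC = 1_000     # producer side: 1,000 records per second
READ_BYTES_PER_SEC = 2_000_000    # consumer side: 2 MB per second


def shards_needed(records_per_sec, avg_record_bytes, consumers=1):
    """Estimate the minimum shard count for a workload (hypothetical helper).

    Assumes every consumer reads the full stream and that consumers
    share the per-shard read limit.
    """
    ingest_bytes = records_per_sec * avg_record_bytes
    by_write_bytes = math.ceil(ingest_bytes / WRITE_BYTES_PER_SEC)
    by_write_count = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_read_bytes = math.ceil(ingest_bytes * consumers / READ_BYTES_PER_SEC)
    # The tightest limit decides; a stream always has at least one shard.
    return max(1, by_write_bytes, by_write_count, by_read_bytes)


if __name__ == "__main__":
    # 5,000 small records per second: the record-count limit dominates.
    print(shards_needed(records_per_sec=5000, avg_record_bytes=500))
    # Fewer, larger records read by three consumers: the read limit dominates.
    print(shards_needed(records_per_sec=100, avg_record_bytes=20000, consumers=3))
```

Whichever limit produces the largest number is the one that drives the shard count, so small records tend to be bound by the records-per-second limit and large records by the byte limits.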

  14. Kinesis Streams Shards
    • One stream is made of many different shards
    • Billing is per provisioned shard; you can have as many shards as you want
    • Producers can batch records or send one record per call
    • The number of shards can change over time (reshard / merge)
    • Records are ordered per shard
    [Diagram: Producer → Shard 1, Shard 2, … Shard n → Consumer]
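
To make the batching bullet concrete, here is a minimal sketch of grouping records before sending them. The helper name `batch_records` is hypothetical. The default limits (500 records and 5 MiB per request, counting partition keys) are my reading of the PutRecords API limits and should be checked against the current AWS documentation.

```python
def batch_records(records, max_count=500, max_bytes=5 * 1024 * 1024):
    """Yield lists of (partition_key, data) tuples that fit in one batch call.

    `records` is an iterable of (str, bytes) pairs. The default limits are
    assumptions based on the documented PutRecords request limits.
    """
    batch, batch_bytes = [], 0
    for key, data in records:
        size = len(key.encode("utf-8")) + len(data)
        # Close the current batch if adding this record would break a limit.
        if batch and (len(batch) >= max_count or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append((key, data))
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    events = [(f"user-{i}", b"payload") for i in range(1201)]
    print([len(b) for b in batch_records(events)])
```

Batching trades a little latency for far fewer API calls; per-record calls remain the simpler choice when volume is low.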

  15. Kinesis Streams Records
    • Data Blob
      • The data being sent, serialized as bytes. Up to 1 MB. Can represent anything
    • Record Key (the partition key)
      • Sent alongside a record; helps group records into shards. Same key = same shard
      • Use a highly distributed key to avoid the “hot partition” problem
    • Sequence Number
      • Unique identifier for each record put in a shard. Added by Kinesis after ingestion
    [Diagram: a record made of a Record Key and a Data Blob (bytes, up to 1 MB)]
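
The "same key = same shard" rule and the hot-partition warning can be illustrated with a small simulation. Kinesis maps each partition key to a 128-bit integer with MD5 and picks the shard whose hash key range contains it. The sketch below assumes the shards split the hash space evenly, which is a simplification; a real stream reports each shard's actual range. The function name `shard_for_key` is hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard a key would land on (simulation only).

    Assumes shard hash key ranges divide the 128-bit space evenly.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


if __name__ == "__main__":
    # Well-distributed keys spread across all shards.
    spread = Counter(shard_for_key(f"user-{i}", 4) for i in range(1000))
    print("distinct keys:", dict(spread))
    # A single repeated key sends every record to one shard (a hot partition).
    hot = Counter(shard_for_key("all-traffic", 4) for _ in range(1000))
    print("one key:", dict(hot))
```

Because the mapping is deterministic, ordering is preserved per key, but a key that carries most of the traffic concentrates load on a single shard regardless of how many shards exist.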

  16. Kinesis Streams Records
    [Diagram: Shard A, Shard B, … Shard N]

  17. Kinesis Producers
    • AWS SDK
    • Kinesis Producer Library
    • Kinesis Agent
    [Diagram: producers writing to an Amazon Kinesis Data Stream]
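
As a rough illustration of the AWS SDK option, the sketch below sends a list of events with a single `put_records` call and returns the ones that were rejected. It is not taken from the talk. The function name `send_events`, the `user_id` key field, and the stream name in the usage comment are hypothetical. The client is passed in so the function can be exercised without AWS credentials; with boto3 it would be `boto3.client("kinesis")`.

```python
import json


def send_events(client, stream_name, events, key_field="user_id"):
    """Send dict events in one PutRecords call; return the events that failed.

    `client` is expected to behave like a boto3 Kinesis client.
    `key_field` (hypothetical) names the event field used as partition key.
    """
    records = [
        {
            "Data": json.dumps(event).encode("utf-8"),  # data blob as bytes
            "PartitionKey": str(event[key_field]),      # same key, same shard
        }
        for event in events
    ]
    response = client.put_records(StreamName=stream_name, Records=records)
    # PutRecords is not all-or-nothing: rejected entries carry an ErrorCode,
    # and the response entries are in the same order as the request.
    return [
        event
        for event, result in zip(events, response["Records"])
        if "ErrorCode" in result
    ]


# With the real SDK (requires AWS credentials and an existing stream):
#   import boto3
#   failed = send_events(boto3.client("kinesis"), "my-stream", events)
```

Returning the failed events lets the caller retry only those, which matters when a shard is temporarily over its write limit.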

  18. Kinesis Consumers
    • AWS Lambda
    • Amazon Kinesis Data Firehose
    • AWS SDK
    • Kinesis Client Library
    • Kinesis Agent
    [Diagram: consumers reading from an Amazon Kinesis Data Stream]

  19. Kinesis Data Analytics

  20. Demo: Transaction Rate Alarm

  21. Transaction Rate Alarm

  22. http://ratings.go-aws.com/rate/89

  23. Stay Connected …
    /suman-d /_sumand
    Stay in touch …

  24. Thank you