Harnessing the power of Redis and Apache Kafka to crunch high-velocity time series data (RedisConf 2021)

Processing Time Series data with Redis and Kafka
Abhishek Gupta
Senior Developer Advocate, Microsoft

About Me
• Focus: Kafka, Databases, Kubernetes
• Blogger, (author), OSS contributor
• Lot of Java in the past. Enjoy Go, Rust

Agenda
• Intro
• Demo – yes, we dive right in!
• Some food for thought
• Wrap up
aka.ms/redis-timeseries-kafka

Time Series data: It’s everywhere
• Think of it as a Tuple (for simplicity)
• A single data point: Time(stamp) and a numeric value
• A Time Series: collection of many such data points
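As a minimal sketch of the tuple view above, a data point can be modelled as a (timestamp, value) pair and a time series as an ordered collection of such pairs. The names DataPoint and series are illustrative only and are not part of any Redis API.

```python
from typing import NamedTuple


class DataPoint(NamedTuple):
    """One sample: a timestamp in milliseconds and a numeric value."""
    timestamp_ms: int
    value: float


# A time series is a collection of such points, kept in timestamp order.
series = [
    DataPoint(1_000, 20.0),
    DataPoint(2_000, 21.5),
    DataPoint(3_000, 19.0),
]

latest = series[-1]                  # most recent sample
values = [p.value for p in series]   # just the measurements
```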

Redis joined the party!
• Before: Sorted Sets, Redis Streams
• RedisTimeSeries: A native data structure
• Thanks to Redis Modules!

RedisTimeSeries commands
• Basic
  ◦ TS.CREATE
  ◦ TS.ADD, TS.MADD
• Query
  ◦ TS.GET, TS.MGET
  ◦ TS.RANGE, TS.MRANGE (filters)
• Aggregations
  ◦ avg, min, max, sum, count
  ◦ TS.CREATERULE
• Clients – Java, Go, Python etc.

Databases are just a part of the solution…
• Time Series data is:
  ◦ Relatively simple, but,
  ◦ Fast: think tens of metrics from thousands of devices/sec
  ◦ Big (data): think data accumulation over months
• How do you collect, send all that data?
  ◦ Just send it directly to Redis – it’s lightning fast, right?
• What we need is a data pipeline. A system to:
  ◦ Decouple producers, consumers
  ◦ Act as a buffer
• Apache Kafka is a good one!
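The decoupling and buffering idea above can be sketched with a bounded in-memory queue standing in for a Kafka topic. This is not Kafka and uses no Kafka client. The names topic, SENTINEL, producer and consumer are hypothetical, chosen only to show that producers and consumers share nothing except the buffer between them.

```python
import queue
import threading

# Stand-in for a Kafka topic: a bounded in-memory buffer (illustration only).
topic = queue.Queue(maxsize=100)
SENTINEL = None  # hypothetical end-of-stream marker


def producer(device_id, readings):
    """Devices only know about the buffer, never about the database."""
    for reading in readings:
        topic.put((device_id, reading))  # blocks if the buffer is full


def consumer(sink):
    """Drains the buffer at its own pace and writes to the sink."""
    while True:
        item = topic.get()
        if item is SENTINEL:
            break
        sink.append(item)  # a real pipeline would write to the time series store here


sink = []
worker = threading.Thread(target=consumer, args=(sink,))
worker.start()

producer("device-1", [20, 21, 22])
producer("device-2", [60, 61])
topic.put(SENTINEL)
worker.join()
```

Because the queue is bounded, a slow consumer makes producers wait instead of overwhelming the store, which is the buffering role the slide assigns to the pipeline.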

Time series processing in action!

Device monitoring: Multiple locations and devices
• Monitor device metrics - Temperature and Pressure
• Time Series setup (simulated data)
  ◦ Name (key) - <metric>:<location>:<device>
  ◦ Labels (metadata) - metric, location, device
  ◦ Examples:
    ▪ TS.ADD temp:3:2 * 20 LABELS metric temp location 3 device 2
    ▪ TS.ADD pressure:3:2 * 60 LABELS metric pressure location 3 device 2
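The key and label convention above can be sketched as a small helper that renders the TS.ADD command text for one reading. The function names series_key and ts_add_command are hypothetical. The helper only formats strings and does not talk to Redis.

```python
def series_key(metric: str, location: int, device: int) -> str:
    """Build the key in the <metric>:<location>:<device> form."""
    return f"{metric}:{location}:{device}"


def ts_add_command(metric: str, location: int, device: int, value, timestamp="*") -> str:
    """Render a TS.ADD command; "*" asks the server to assign the timestamp."""
    key = series_key(metric, location, device)
    return (
        f"TS.ADD {key} {timestamp} {value} "
        f"LABELS metric {metric} location {location} device {device}"
    )


temp_cmd = ts_add_command("temp", 3, 2, 20)
pressure_cmd = ts_add_command("pressure", 3, 2, 60)
print(temp_cmd)
print(pressure_cmd)
```

The two printed commands match the examples on the slide. Keeping metric, location and device both in the key and in the labels is what lets label filters select series later.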

High level architecture

Some food for thought

RedisTimeSeries specifics
Retention policy
• Maximum age for samples compared to last event time
• Time series data does not get trimmed by default
Rules for down-sampling/Aggregations
• TS.CREATERULE temp:1:2 temp:avg:30 AGGREGATION avg 30000
Duplicate data policy
• How to handle duplicate samples?
• Default: BLOCK (error out)
• Other options: FIRST, LAST, MIN, MAX, SUM
Source: RedisLabs docs
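The three settings above can be modelled in plain Python to show the arithmetic involved. This is a sketch of the semantics as described on the slide, not the module's implementation. The function names are hypothetical, the exact boundary handling in apply_retention is an assumption, and the down-sampling function averages every bucket at once rather than emitting buckets incrementally as a live rule would.

```python
from collections import defaultdict

BUCKET_MS = 30_000  # matches "AGGREGATION avg 30000" on the slide


def downsample_avg(samples, bucket_ms=BUCKET_MS):
    """Average (timestamp_ms, value) samples per fixed-width time bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_ms].append(value)  # bucket start time
    return [(start, sum(v) / len(v)) for start, v in sorted(buckets.items())]


def apply_retention(samples, retention_ms):
    """Keep samples no older than retention_ms relative to the newest one.

    retention_ms == 0 models the default: nothing is trimmed.
    """
    if retention_ms == 0 or not samples:
        return list(samples)
    newest = max(ts for ts, _ in samples)
    return [(ts, v) for ts, v in samples if newest - ts <= retention_ms]


def resolve_duplicate(policy, existing, new):
    """Pick the stored value when a sample arrives for an existing timestamp."""
    if policy == "BLOCK":
        raise ValueError("duplicate sample rejected")
    if policy == "FIRST":
        return existing
    if policy == "LAST":
        return new
    if policy == "MIN":
        return min(existing, new)
    if policy == "MAX":
        return max(existing, new)
    if policy == "SUM":
        return existing + new
    raise ValueError(f"unknown policy: {policy}")


samples = [(0, 20.0), (10_000, 22.0), (29_999, 24.0), (30_000, 30.0), (45_000, 32.0)]
averaged = downsample_avg(samples)         # two 30-second buckets
trimmed = apply_retention(samples, 20_000)  # only the newest 20 seconds survive
```

With these samples the first three readings fall into the bucket starting at 0 and the last two into the bucket starting at 30000, so the down-sampled series has two points.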

Visualizations
• Grafana dashboard powered by Redis Data Source for Grafana
• Redis Time Series adapter for Prometheus
• Redis Time Series Telegraf plugin

Other considerations
• Scalability - Your time series data volumes can only move one way – up!
• Long term data retention – cost-efficient storage
• Integration – RedisTimeSeries connector

Key takeaways
Think about:
• The end-to-end data pipeline from source to Redis and beyond
• Data modeling, down-sampling and data retention

Next Steps, resources
• GitHub repo: aka.ms/redis-timeseries-kafka
• Azure Cache for Redis Enterprise Tiers

Thank you.
