
Luciano
November 12, 2019

Processing 60 million messages a day using Redis: Lessons from the trenches

This is a tale about revisiting the architectural decisions we made when building a platform that processes more than 60 million messages a day. We introduce the new architectural design and explain how we chose the right tool to model our problem: beyond some redesign work, we decided to move from Apache Cassandra to Redis as our main database solution.

Wavy Global is part of the Movile group, one of the earliest Brazilian unicorns, and is the leading message broker for Latin America, providing customer engagement through channels such as SMS, WhatsApp, Apple Business Chat and RCS. Our core platform processes more than 2 billion events per month using Redis, with peaks of more than 15k events/sec. We will explain why we moved from Apache Cassandra to Redis and the lessons we have learned running Redis under very intensive throughput to meet our strict SLA requirements.


Transcript

  1. Agenda
     1. Messaging use case: one of our platform's use cases
     2. Problem statement: the problem behind the use case
     3. Legacy Architecture / New Architecture: how we solved the problem in the past and how we are solving it today
     4. Results and Lessons Learned: our achieved results and lessons
  2. Example SMS (100852, Air Company): "Airline Company informs: You could start your check-in process for the flight ABCDE123."
  3. Sending messages
     Diagram labels: Air Company, Wavy's Platform, Mobile Network Operator
     SMS (100852): "Airline Company informs: You could start your check-in process for the flight ABCDE123."
  4. Receiving delivery receipts
     Diagram labels: Mobile Network Operator
     SMS (100852): "Airline Company informs: You could start your check-in process for the flight ABCDE123."
  5. Receiving delivery receipts
     Diagram labels: Wavy's Platform, Mobile Network Operator
     SMS (100852): "Airline Company informs: You could start your check-in process for the flight ABCDE123."
  6. Receiving delivery receipts
     Diagram labels: Air Company, Wavy's Platform, Mobile Network Operator
     SMS (100852): "Airline Company informs: You could start your check-in process for the flight ABCDE123."
  7. Challenges
     Diagram labels: Wavy's Platform, MNO
     • ~30 million SMS/day
     • ~30 million DR/day
     • Slow "Realtime" Matching
  8. Legacy Architecture - Drawbacks
     • A lot of put-and-get overhead due to queue usage.
     • Apache Cassandra is not the most suitable tool for this task:
       ◦ High delete rate
       ◦ Slow table scans
       ◦ Consistency (CAP -> AP)
  9. Results
     We saw a great improvement in our SLA, from minutes to milliseconds, reaching peaks of more than 23k ops/s.
  10. Results
     Using Redis we need less than ⅓ of the original Apache Cassandra cluster:
     • Relevant savings in hardware and operations costs
     Lua scripts solved our consistency problems:
     • We resolved complaints about lost DRs
     • Cassandra replication
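The deck does not show the matching logic or the Lua scripts themselves, so the following is only a minimal sketch of the general idea, under stated assumptions. It models the store as an in-memory Python dict (a stand-in for Redis, not Wavy's code) and pairs a sent message with its delivery receipt (DR) regardless of which one arrives first. The names `MatchStore`, `put_or_match`, and the `"sent"`/`"receipt"` kinds are hypothetical. In Redis, the read-then-write step would presumably live inside a single Lua script, since Redis runs each script atomically.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class MatchStore:
    """In-memory stand-in for the key-value store (hypothetical illustration)."""

    # message_id -> (kind, payload) for the half of the pair that arrived first
    pending: Dict[str, Tuple[str, dict]] = field(default_factory=dict)

    def put_or_match(
        self, message_id: str, kind: str, payload: dict
    ) -> Optional[Tuple[dict, dict]]:
        """Store one half of a pair, or return (sent, receipt) once both are known.

        This check-then-write is the step that would need to be atomic in a
        shared store, which is what a server-side script provides.
        """
        waiting = self.pending.get(message_id)
        if waiting is None or waiting[0] == kind:
            # First arrival (or a repeat of the same kind): remember it.
            self.pending[message_id] = (kind, payload)
            return None
        # The counterpart is already waiting: remove it and emit the pair.
        del self.pending[message_id]
        other_payload = waiting[1]
        if kind == "receipt":
            return (other_payload, payload)
        return (payload, other_payload)


# The receipt arrives before the platform has recorded the sent message.
store = MatchStore()
first = store.put_or_match("msg-1", "receipt", {"status": "DELIVERED"})
second = store.put_or_match("msg-1", "sent", {"text": "check-in open"})
```

Here `first` is `None` because nothing was waiting yet, and `second` is the matched pair. Without atomicity, two clients could both read "nothing waiting" and both write, which is one way a DR could end up lost.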
  11. Lessons Learned - Tooling
     Use the right tool for the job:
     • Streaming technology instead of queues for multiple consumers
     • Redis instead of Apache Cassandra for ephemeral data with consistency needs and high throughput
  12. Lessons Learned - Redis
     Have a safety margin in your cluster:
     • Under high throughput, Redis eviction might be slower than your application throughput
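One rough way to reason about a safety margin is to estimate how much data is alive at once and compare it with the memory limit. The sketch below assumes every entry lives for a fixed time before it is removed (the deck only says the data is ephemeral, so the TTL is an assumption) and applies Little's law. All numbers and function names are hypothetical and chosen only for illustration.

```python
def steady_state_bytes(
    writes_per_sec: float, ttl_seconds: float, bytes_per_entry: float
) -> float:
    """Little's law: entries alive at once = arrival rate * lifetime of each entry."""
    return writes_per_sec * ttl_seconds * bytes_per_entry


def headroom_ratio(memory_limit_bytes: float, expected_bytes: float) -> float:
    """Fraction of the memory limit that stays free at the expected load."""
    return 1.0 - expected_bytes / memory_limit_bytes


# Hypothetical load: 5,000 writes/s, 10-minute lifetime, 1,000 bytes per entry.
expected = steady_state_bytes(writes_per_sec=5_000, ttl_seconds=600, bytes_per_entry=1_000)
# Hypothetical memory limit of 4 GB (decimal).
margin = headroom_ratio(memory_limit_bytes=4_000_000_000, expected_bytes=expected)
```

With these inputs the expected footprint is 3,000,000,000 bytes and the margin is 0.25. The estimate covers payload bytes only; per-key overhead and bursts above the average rate both eat into that margin, which is the reason to keep one.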
  13. Lessons Learned - Redis
     Optimize what you save:
     • Discard unused fields
     • Remove metadata (if you can do it)
     • Rename fields
     • JSON + Gzip <3
     Payload size: 3 KB => 1 KB
     Average memory usage (normal operation): 36 GB => 12 GB
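The payload steps above can be sketched with the Python standard library. This is an illustration under assumptions, not the platform's actual schema: the field names and the `SHORT_NAMES` mapping are hypothetical. Fields missing from the mapping are discarded, the rest are renamed to shorter keys, and the compact JSON is gzip-compressed before it would be written to the store.

```python
import gzip
import json

# Hypothetical mapping from verbose field names to short stored names.
# Any field that is not listed here is treated as unused and dropped.
SHORT_NAMES = {"message_id": "id", "destination": "to", "status": "st"}


def pack(record: dict) -> bytes:
    """Drop unused fields, rename the rest, then gzip the compact JSON."""
    slim = {SHORT_NAMES[key]: value for key, value in record.items() if key in SHORT_NAMES}
    raw = json.dumps(slim, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def unpack(blob: bytes) -> dict:
    """Reverse pack(): decompress, parse, and restore the verbose field names."""
    long_names = {short: long for long, short in SHORT_NAMES.items()}
    slim = json.loads(gzip.decompress(blob).decode("utf-8"))
    return {long_names[key]: value for key, value in slim.items()}


record = {
    "message_id": "msg-1",
    "destination": "5511999990000",
    "status": "SENT",
    "debug_trace": "not needed after delivery",  # dropped by pack()
}
restored = unpack(pack(record))
```

`restored` contains the three mapped fields with their original names and no `debug_trace`. Gzip adds a fixed header and trailer, so very small payloads can come out larger after compression; measuring on real payloads is the only way to know whether a reduction like the 3 KB => 1 KB on the slide applies.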