Journey to the Real-Time Analytics in Extreme Growth

AppsFlyer
September 22, 2016

At AppsFlyer we provide a real-time analytics dashboard for marketers, who use it to invest their $$$ budgets wisely. We aggregate some 8 billion daily events in real time, and our previous solution could not handle this load: the dashboard just loaded forever, and Kafka lag was our daily and nightly headache. Product constantly demanded new features and, guess what, we just couldn't deliver them! Moreover, we faced dangerous failures and the risk of losing serious data, something we obviously couldn't afford.
We started looking for a new infrastructure. We tried different databases and technologies, including Cassandra, Mongo, Redis and Druid, and none of them provided the solution we needed.
Join me on our journey and I will show you the current solution, which implements real-time aggregation over MemSQL integrated with batch processing over Apache Spark. The new architecture not only solved our pains but also allowed us to aggregate 10x the amount of data with much faster response times and keep up with product demands, and it is cheaper to run in production.
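To make the idea of combining a real-time store with a batch-built history store more concrete, here is a minimal Python sketch of how a middleware layer might split a dashboard query by date and merge the two results. It is an illustration only, not AppsFlyer's code: the in-memory dictionaries stand in for the real-time and history databases, and the names `split_range`, `query`, `dashboard_totals` and the 14-day window are assumptions made for this example.

```python
from collections import Counter
from datetime import date, timedelta


def split_range(start, end, today, realtime_days=14):
    """Split the inclusive range [start, end] into (history, recent) sub-ranges.

    Days older than `realtime_days` go to the history store, the rest to the
    real-time store. The 14-day default is an assumed value.
    """
    cutoff = today - timedelta(days=realtime_days)  # first day served in real time
    hist = (start, min(end, cutoff - timedelta(days=1))) if start < cutoff else None
    recent = (max(start, cutoff), end) if end >= cutoff else None
    return hist, recent


def query(store, day_range):
    """Sum per-campaign counts over an inclusive day range.

    `store` is a dict {(day, campaign): count}, a stand-in for a real database.
    """
    totals = Counter()
    if day_range is None:
        return totals
    first, last = day_range
    for (day, campaign), count in store.items():
        if first <= day <= last:
            totals[campaign] += count
    return totals


def dashboard_totals(history_store, realtime_store, start, end, today):
    """Answer one dashboard query by merging history and real-time results."""
    hist, recent = split_range(start, end, today)
    totals = query(history_store, hist)
    totals.update(query(realtime_store, recent))  # Counter.update adds counts
    return dict(totals)
```

Because the dashboard only talks to a function like `dashboard_totals`, the stores behind it can be swapped without touching the dashboard itself.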

Transcript

  1. Journey to the Real-Time Analytics in Extreme Growth yulia@appsflyer.com

  2. Real Time Dashboard • User acquisition • 8B events daily

  3. Data is Mutable

  4. Previous solution - Toku (Mongo) KAFKA Toku writers Toku master Toku slaves Dashboard

  5. Toku Problems • Failures on a weekly basis • Bad modeling • No recovery

  6. Requirements • Real-time • More events (more data) • More dimensions (MUCH MORE DATA !!!) • Stability • Faster

  7. Dashboard - DB abstraction level KAFKA Toku writers Toku master Toku slaves Dashboard Middleware (Vishnu)

  8. We tried...

  9. None
  10. https://www.meetup.com/Druid-Israel/events/232075974/

  11. What did we gain? • Flexible middleware • Batch daily process - first step to recovery • Developers Paradise

  12. Down to Earth

  13. MemSQL In Memory Scalable DB

  14. Current Solution - MemSQL

  15. MemSQL Architecture KAFKA MemSQL writers MemSQL Cluster Dashboard Middleware (Vishnu) MemSQL writers MemSQL Cluster (Slave)

  16. Recovery KAFKA (24h) MemSQL writers Master MemSQL Cluster Dashboard Middleware (Vishnu) Yesterday snapshot Recovery MemSQL Cluster MemSQL writers - only current day

  17. MemSQL - Quick Win • Fast • Recoverable • Possibility to return to 0 point • Ability to add new features • More Data (30x)

  18. Show me the numbers • Data - 100 GB x 2 clusters • Query Latency - 1-3 seconds • Machines x 2 clusters – 2 aggregators - m4.4xlarge – 4 leaves - r3.4xlarge • Cost reduction - $20K less than Toku monthly

  19. Good Enough Approach • More data - more money • Less money - less data

  20. Current - Architecture KAFKA writers - only new data MemSQL Rowstore Cluster 1-2 weeks Dashboard Middleware (Vishnu) Daily Batch process S3 files MemSQL Columnstore History Cluster Daily

  21. “Premature optimization is the root of all evil” Donald Knuth

  22. Yulia@appsflyer.com

  23. appsflyer.com/jobs

  24. http://www.shutterstock.com/pic.mhtml?utm_campaign=ClipartLogo&irgwc=1&tpl=46764-50655&id=154723511&language=en&utm_medium=Affiliate&utm_source=46764 http://www.samatters.com/wp-content/uploads/2015/07/round-peg.jpg http://marsmedia.info/en/blog/cassandra.png http://www.zdnet.de/wp-content/uploads/2013/10/mongodb-logo.jpg https://chris.lu/upload/images/redis.png https://upload.wikimedia.org/wikipedia/en/b/ba/Druid_MasterLogo_Full_Color_Small.png https://www.leftronic.com/wp-content/uploads/2015/04/Amazonredshift_220x110.png
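
The recovery slide (16) can be read as: restore yesterday's snapshot into a recovery cluster, then let the writers replay only the current day's events from Kafka, which retains 24 hours of data. The Python sketch below illustrates that reading with plain in-memory structures. It is not the actual implementation: the event fields (`day`, `campaign`, `delta`) and the function names are hypothetical, and a signed `delta` is just one assumed way to model the "data is mutable" point from slide 3.

```python
from collections import defaultdict


def apply_event(state, event):
    """Upsert one event into the aggregate state.

    `delta` may be negative, so a later event can correct an earlier count
    (assumed model of mutable data).
    """
    key = (event["day"], event["campaign"])
    state[key] += event["delta"]


def recover(snapshot, retained_events, current_day):
    """Rebuild aggregates from yesterday's snapshot plus today's events.

    `snapshot` is a dict {(day, campaign): count} taken at the end of
    yesterday; `retained_events` stands in for what a 24h Kafka retention
    window would still hold.
    """
    state = defaultdict(int, snapshot)  # start from the snapshot
    for event in retained_events:
        if event["day"] == current_day:  # replay only the current day
            apply_event(state, event)
    return dict(state)
```

Events from earlier days are skipped during replay because the snapshot already contains them; replaying them again would double count.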