Behavioral Databases

benbjohnson
October 24, 2012

A presentation I gave at the Derailed meetup in Denver on October 24th, 2012.

Transcript

  1. Behavioral Databases: Next Generation NoSQL Analytics, by Ben Johnson

  2. My Background

  3. Former Oracle DBA

  4. Former Oracle DBA, Behavioral Analytics

  5. Former Oracle DBA, Data Visualization, Behavioral Analytics

  6. Why did I write a database?

  7. Other Databases Are Too Slow: Redis 100K ops/second, PostgreSQL 5K QPS/core

  8. Processing Needed to Occur Near Data (Client/Server diagram: 1ms, 1ms)

  9. Processing Needed to Occur Near Data (Client/Server diagram: 1ms, 1ms, 1ms, 1ms)

  10. Processing Needed to Occur Near Data (Client/Server diagram: 1ms, 1ms, 1ms, 1ms, 1ms, 1ms)

  11. Processing Needed to Occur Near Data (Client/Server diagram: 1ms, 1ms, 1ms, 1ms, 1ms, 1ms, 1ms, 1ms)

  12. Processing Needed to Occur Near Data

  13. Memory is getting cheap. Use it. 2010: $20/GB (http://www.jcmit.com/memoryprice.htm)

  14. Memory is getting cheap. Use it. 2011: $10/GB, 2010: $20/GB (http://www.jcmit.com/memoryprice.htm)

  15. Memory is getting cheap. Use it. 2012: $5/GB, 2011: $10/GB, 2010: $20/GB (http://www.jcmit.com/memoryprice.htm)

  16. Traditional Databases Have a Lot Of Features I Don’t Need

  17. Locks & Latches

  18. Transactions

  19. Traditional Databases are Limited to Simple Data Access: Key/Value, Tabular

  20. Traditional Databases are Limited to Simple Data Access: Key/Value, Tabular (Boring)

  21. Traditional Databases Are Not Real-Time

  22. Basics of Behavioral Data

  23. Actions

  24. Actions & State

  25. Actions & State + Time

  26. Clickstream Logs, Sensor Data, Financial Transactions, IVR

  27. Important Differences In Behavioral Data

  28. Behavioral Data is Historical

  29. Behavioral Data is Isolated

  30. Isolated = Concurrency

  31. Performance / Internals

  32. Aggregates 100M events / sec / core

  33. Optimizations: Stored by object then time; Memory mapped; Easy, compact data format

  34. Simplicity: Supports Int64, Double, String & Bool; MessagePack Encoded; C99, No Dependencies

  35. Writing Your Own Language (Tangent)

  36. EQL (Event Query Language)

  37. EQL (Event Query Language)

  38. Qip

  39. Qip (Doesn’t stand for anything)

  40. What is Qip?

    * LLVM-backed query processing language.
    * JIT compiled on the fly.
    * As fast as C.
    * Removed from build because of complexity.

  41. GitHub Archive Visualizer (Demo)

  42. Ruby to Sky Integration

  43. Add Event: SkyDB.add_event(Event.new(object_id: 1, timestamp: Time.now, action: "/sign_up", data: { name: "John", age: 20 }))

  44. Next Actions: SkyDB.next_actions(["/", "/sign_up", "/checkout"])

  45. What’s Next For Sky?

  46. More Analytics Functions!

  47. Cohort Analysis

    Month Signed Up | Months After Signing Up
                    |  1    2    3    4    5
    Jan             | 80%  70%  65%  63%  62%
    Feb             | 83%  73%  70%  69%
    Mar             | 87%  78%  75%
    Apr             | 89%  80%
    May             | 90%

  48. DAGs: Home Page, Sign Up, Checkout, View Product, Cancel Order

  49. Awesome Open Source Analytics Tools: It’s like MixPanel that you can install!

  50. Modules!

  51. Predictive Behavioral Analytics: What will your users do next?

  52. Risk Analysis

  53. Anomaly Detection / Fraud Detection

  54. Questions?

  55. Contact Info @benbjohnson benbjohnson@yahoo.com
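
Slides 33, 43 and 44 together suggest a simple model: events are grouped per object, kept in time order, and then queried for what happens after a given sequence of actions. The following is a minimal in-memory sketch of that idea in plain Ruby. It is not Sky's storage engine or the SkyDB client API: the class name `TinyEventStore`, its method signatures, and the choice to return a hash of counts are all assumptions made for illustration.

```ruby
# Hypothetical illustration only: an in-memory stand-in, not Sky or the SkyDB gem.
class TinyEventStore
  def initialize
    # object_id => that object's events, kept sorted by timestamp
    # ("stored by object then time", as on slide 33).
    @timelines = Hash.new { |hash, key| hash[key] = [] }
  end

  # Insert an event into its object's timeline, preserving time order
  # even when events arrive out of order.
  def add_event(object_id:, timestamp:, action:, data: {})
    timeline = @timelines[object_id]
    event = { timestamp: timestamp, action: action, data: data }
    # Binary search for the first event that is later than the new one.
    index = timeline.bsearch_index { |e| e[:timestamp] > timestamp } || timeline.size
    timeline.insert(index, event)
    event
  end

  # For every place where an object performed the actions in `path`
  # consecutively, count the action that came immediately afterwards.
  def next_actions(path)
    counts = Hash.new(0)
    @timelines.each_value do |timeline|
      actions = timeline.map { |e| e[:action] }
      actions.each_cons(path.size + 1) do |window|
        counts[window.last] += 1 if window.first(path.size) == path
      end
    end
    counts
  end
end

store = TinyEventStore.new
store.add_event(object_id: 1, timestamp: Time.at(0),  action: "/")
store.add_event(object_id: 1, timestamp: Time.at(20), action: "/checkout")
store.add_event(object_id: 1, timestamp: Time.at(10), action: "/sign_up",
                data: { name: "John", age: 20 })
store.add_event(object_id: 2, timestamp: Time.at(5),  action: "/")
store.add_event(object_id: 2, timestamp: Time.at(6),  action: "/sign_up")
store.add_event(object_id: 2, timestamp: Time.at(7),  action: "/cancel")

p store.next_actions(["/", "/sign_up"])
```

In this sample data, each object visits "/" and then "/sign_up", so the query counts "/checkout" once and "/cancel" once. Because each object's timeline is independent of the others, the loop over timelines could be split across workers, which is the point slide 30 makes about isolation and concurrency.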