Behavioral Databases

benbjohnson
October 24, 2012

A presentation I gave at the Derailed meetup in Denver on October 24th, 2012.

Transcript

  1. Behavioral Databases
    Next Generation NoSQL Analytics
    By Ben Johnson

  2. My Background

  3. Former Oracle DBA

  4. Former Oracle DBA
    Behavioral Analytics

  5. Former Oracle DBA
    Data Visualization
    Behavioral Analytics

  6. Why did I write a
    database?

  7. Other Databases Are Too Slow
    Redis
    100K ops/second
    PostgreSQL
    5K QPS/core

  8. Processing Needed to Occur Near Data
    Client Server
    1ms
    1ms

  9. Processing Needed to Occur Near Data
    Client Server
    1ms
    1ms
    1ms
    1ms

  10. Processing Needed to Occur Near Data
    Client Server
    1ms
    1ms
    1ms
    1ms
    1ms
    1ms

  11. Processing Needed to Occur Near Data
    Client Server
    1ms
    1ms
    1ms
    1ms
    1ms
    1ms
    1ms
    1ms

  12. Processing Needed to Occur Near Data
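
The build-up across the previous slides is simple arithmetic: every request/response pair between client and server adds two 1 ms hops. A minimal sketch of that arithmetic, assuming the 1 ms per-hop figure from the slides (the constant and method names here are hypothetical, chosen only for illustration):

```ruby
# Per-hop latency taken from the slides; everything else is illustrative.
HOP_LATENCY_MS = 1.0

# Network wait for a number of dependent client/server round trips.
# Each round trip is one hop to the server and one hop back.
def network_wait_ms(round_trips, hop_latency_ms = HOP_LATENCY_MS)
  round_trips * 2 * hop_latency_ms
end

# The four builds of the slide: 1, 2, 3, and 4 round trips.
waits = (1..4).map { |n| network_wait_ms(n) }
```

Under the same assumption, a query that needs 1,000 dependent round trips spends 2,000 ms waiting on the network alone, which is the case for moving the processing next to the data.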

  13. Memory is getting cheap. Use it.
    2010
    $20/GB
    http://www.jcmit.com/memoryprice.htm

  14. Memory is getting cheap. Use it.
    2011
    $10/GB
    2010
    $20/GB
    http://www.jcmit.com/memoryprice.htm

  15. Memory is getting cheap. Use it.
    2012
    $5/GB
    2011
    $10/GB
    2010
    $20/GB
    http://www.jcmit.com/memoryprice.htm

  16. Traditional Databases Have a Lot
    Of Features I Don’t Need

  17. Locks & Latches

  18. Transactions

  19. Traditional Databases are Limited to
    Simple Data Access
    Key/Value Tabular

  20. Traditional Databases are Limited to
    Simple Data Access
    Key/Value Tabular
    (Boring)

  21. Traditional Databases Are Not Real-Time

  22. Basics of Behavioral Data

  23. Actions

  24. Actions & State

  25. Actions & State + Time

  26. Clickstream
    Logs
    Sensor Data
    Financial Transactions
    IVR

  27. Important Differences In
    Behavioral Data

  28. Behavioral Data
    is Historical

  29. Behavioral Data is Isolated

  30. Isolated = Concurrency
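
One way to read "Isolated = Concurrency": if each object's events never depend on another object's events, every timeline can be processed by its own worker without locks. The sketch below assumes that reading; the sample data and variable names are hypothetical and this is not Sky's implementation.

```ruby
# Hypothetical per-object timelines: object id => actions in time order.
timelines = {
  1 => ["/", "/sign_up"],
  2 => ["/", "/sign_up", "/checkout"],
}

# One worker per object. No locking is needed because each worker
# reads only its own timeline and shares no mutable state.
workers = timelines.map do |object_id, actions|
  Thread.new { [object_id, actions.length] }
end

# Collect each worker's result into object id => event count.
counts = workers.map(&:value).to_h
```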

  31. Performance / Internals

  32. Aggregates
    100M events / sec / core

  33. Optimizations
    Stored by object then time
    Memory mapped
    Easy, compact data format
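
"Stored by object then time" can be pictured as sorting events on the pair (object, timestamp), so that each object's history forms one contiguous, time-ordered run that can be scanned sequentially. A minimal in-memory sketch of that ordering, assuming integer timestamps and a hypothetical `Event` struct (the on-disk layout Sky actually uses is not described in the slides):

```ruby
# Hypothetical event record, used only for this illustration.
Event = Struct.new(:object_id, :timestamp, :action)

events = [
  Event.new(2, 30, "/checkout"),
  Event.new(1, 20, "/sign_up"),
  Event.new(2, 10, "/"),
  Event.new(1, 10, "/"),
]

# Order by object first, then by time within each object.
ordered = events.sort_by { |e| [e.object_id, e.timestamp] }

# Adjacent events with the same object id now form one timeline each.
timelines = ordered.chunk_while { |a, b| a.object_id == b.object_id }.to_a
paths = timelines.map { |timeline| timeline.map(&:action) }
```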

  34. Simplicity
    Supports Int64, Double, String & Bool
    MessagePack Encoded
    C99, No Dependencies

  35. Writing Your Own Language
    (Tangent)

  36. EQL
    (Event Query Language)

  37. EQL
    (Event Query Language)

  38. Qip

  39. Qip
    (Doesn’t stand for anything)

  40. What is Qip?
    * LLVM-backed query processing language.
    * JIT compiled on the fly.
    * As fast as C.
    * Removed from build because of complexity.

  41. GitHub Archive Visualizer
    (Demo)

  42. Ruby to Sky
    Integration

  43. Add Event
    # Record a "/sign_up" action for object 1, with its state at that time.
    SkyDB.add_event(
      Event.new(
        object_id: 1,
        timestamp: Time.now,
        action: "/sign_up",
        data: {
          name: "John",
          age: 20
        }
      )
    )

  44. Next Actions
    SkyDB.next_actions(
      [
        "/",
        "/sign_up",
        "/checkout"
      ]
    )
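
The slide does not say what `next_actions` returns. A plausible reading is that it counts which action immediately follows the given sequence in each object's timeline. The pure-Ruby sketch below implements that assumed meaning over in-memory arrays; `next_action_counts` and the sample timelines are hypothetical and are not part of the SkyDB client.

```ruby
# Count the action that directly follows `prior` wherever `prior`
# appears as a consecutive run inside a timeline.
def next_action_counts(timelines, prior)
  counts = Hash.new(0)
  timelines.each do |actions|
    # Slide a window of prior.length + 1 actions over the timeline.
    actions.each_cons(prior.length + 1) do |window|
      counts[window.last] += 1 if window[0...-1] == prior
    end
  end
  counts
end

# Hypothetical timelines, one array of actions per object.
timelines = [
  ["/", "/sign_up", "/checkout", "/thanks"],
  ["/", "/sign_up", "/checkout", "/cancel"],
  ["/", "/pricing", "/sign_up"],
  ["/", "/sign_up", "/checkout", "/thanks"],
]

result = next_action_counts(timelines, ["/", "/sign_up", "/checkout"])
```

With this sample data, two timelines continue to "/thanks" and one to "/cancel"; the third timeline never contains the full sequence and contributes nothing.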

  45. What’s Next
    For Sky?

  46. More Analytics
    Functions!

  47. Cohort Analysis
    Month Signed Up    Months After Signing Up
                       1     2     3     4     5
    Jan                80%   70%   65%   63%   62%
    Feb                83%   73%   70%   69%
    Mar                87%   78%   75%
    Apr                89%   80%
    May                90%
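
A table like this can be derived by grouping users by signup month and, for each later month, computing the share of the group that was still active. The sketch below is one way to do that under stated assumptions: months are plain integers, the input shape and the `cohort_retention` name are hypothetical, and the data is made up rather than taken from the slide.

```ruby
# Hypothetical input: user => [signup month, months with any activity].
users = {
  "a" => [1, [2, 3, 4]],
  "b" => [1, [2]],
  "c" => [2, [3, 4]],
  "d" => [2, []],
}

# Returns signup month => { months after signup => percent still active }.
def cohort_retention(users)
  cohorts = Hash.new { |h, k| h[k] = { size: 0, active: Hash.new(0) } }
  users.each_value do |signup, active_months|
    cohort = cohorts[signup]
    cohort[:size] += 1
    # Count each user at most once per month, only for months after signup.
    active_months.uniq.each do |month|
      offset = month - signup
      cohort[:active][offset] += 1 if offset >= 1
    end
  end
  cohorts.transform_values do |c|
    c[:active].sort.to_h.transform_values { |n| (100.0 * n / c[:size]).round }
  end
end

table = cohort_retention(users)
```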

  48. DAGs
    Home Page
    Sign Up
    Checkout
    View Product
    Cancel Order

  49. Awesome
    Open Source
    Analytics Tools
    It’s like MixPanel that you can install!

  50. Modules!

  51. Predictive
    Behavioral Analytics
    What will your users do next?

  52. Risk Analysis

  53. Anomaly Detection /
    Fraud Detection

  54. Questions?

  55. Contact Info
    @benbjohnson
    [email protected]
