Ivory - Data Modelling

Ambiata
October 20, 2014

Transcript

  1. IVORY DATA MODELLING http://github.com/ambiata/ivory © Ambiata 2014

  2. WHAT WE START WITH

  4. WHAT WE NEED

  5. Feature vectors (each row is an instance, each column is a feature)

    instance  gender  balance  purchases  zipcode  prop_online  num_accs
    89340218  M          0.00          3  3001     1.00         1
    48149407  -        634.83         16  4670     0.6875       3
    18452274  F         15.12          2  -        0.50         1
    07499337  M         33.56          2  -        1.00         1
    62948721  F         98.34         12  3303     0.8333       4
    93754723  -        523.81         23  2046     0.4782       1
    00272446  F       1086.05         17  -        1.00         2
    13374497  F        224.81          9  -        0.2222       1
    31989993  M         78.21          2  2134     0.50         1
    46474236  -        126.48          4  -        0.0          1

  6. Ingest facts -> Ivory Repository -> Extract features

  7. Source data -> Fact ETL (entity resolution + attribution) -> Factset -> Ingest facts -> Ivory Repository -> Extract features

  8. WHAT’S A FACT?

  9. WHAT’S A FEATURE?

  10. FACT
    • Atomic piece of information attributed to an entity
    • 2 types: states and events
    • Captured as close to the “source” as possible

  11. State facts
    • Demographics, e.g. gender, DOB, zipcode
    • Account statuses
    • Subscription states
    • Snapshots, e.g. account balance at end of month
    • Segments

  12. Event facts
    • Purchases
    • Page views
    • Phone calls
    • Queries

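The state/event distinction above can be sketched as a small record type. This is only an illustration: the `Fact` class, the `Kind` enum, the field names, and the sample values are assumptions made for this example, not Ivory's actual schema or data.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Union


class Kind(Enum):
    STATE = "state"  # holds until superseded, e.g. gender or account status
    EVENT = "event"  # happened at a point in time, e.g. a purchase


@dataclass(frozen=True)
class Fact:
    """Hypothetical fact record: one atomic value attributed to one entity."""
    entity: str                    # id of the resolved entity
    attribute: str                 # what the fact describes
    value: Union[str, int, float]  # the observed value
    time: date                     # when the value was observed
    kind: Kind                     # state or event


# Made-up sample facts for a single entity.
facts = [
    Fact("89340218", "gender", "M", date(2014, 1, 1), Kind.STATE),
    Fact("89340218", "purchase", 25.0, date(2014, 3, 2), Kind.EVENT),
    Fact("89340218", "purchase", 12.5, date(2014, 3, 9), Kind.EVENT),
]

# Events and states are kept in the same shape and differ only by kind.
events = [f for f in facts if f.kind is Kind.EVENT]
states = [f for f in facts if f.kind is Kind.STATE]
```

Keeping each fact atomic and timestamped, as the slides suggest, leaves any aggregation to the later feature-extraction step.
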
  13. FEATURE
    • Attribute that describes one aspect of an entity
    • Derived from facts
    • Simplest feature is “latest value before ‘date’”

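As a rough sketch of the simplest feature named above, "latest value before 'date'", the function below picks the most recent fact for an entity and attribute strictly before a cut-off date. The `Fact` tuple, the function name `latest_before`, the strict `<` comparison, and the sample zipcode facts are all assumptions made for illustration.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class Fact(NamedTuple):
    """Hypothetical fact record (not Ivory's actual schema)."""
    entity: str
    attribute: str
    value: object
    time: date


def latest_before(facts: Sequence[Fact], entity: str,
                  attribute: str, as_of: date) -> Optional[object]:
    """Return the most recent value recorded strictly before `as_of`,
    or None when the entity has no such fact (a missing feature value)."""
    candidates = [
        f for f in facts
        if f.entity == entity and f.attribute == attribute and f.time < as_of
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.time).value


# Made-up state facts: the entity moved between two zipcodes.
facts = [
    Fact("89340218", "zipcode", 3001, date(2013, 6, 1)),
    Fact("89340218", "zipcode", 4670, date(2014, 2, 1)),
]

early = latest_before(facts, "89340218", "zipcode", date(2014, 1, 1))
late = latest_before(facts, "89340218", "zipcode", date(2014, 3, 1))
missing = latest_before(facts, "89340218", "zipcode", date(2013, 1, 1))
```

The same facts yield different feature values depending on the cut-off date, which is why the feature is defined relative to a date rather than stored directly.
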
  14. • Latest
    • Days since latest, days since earliest
    • Count, sum
    • Mean, quantile, proportion
    • Gradient, state changes
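
Several of the feature types listed above can be sketched in a few lines of Python. The sample purchases, statuses, and balance snapshots are made up, and the helper names (`state_changes`, `gradient`) and the choice of a least-squares slope per day for "gradient" are assumptions for this example rather than Ivory's definitions.

```python
from datetime import date
from statistics import mean

as_of = date(2014, 4, 1)  # features only use facts before this date

# Made-up purchase events for one entity: (time, amount, bought online?)
purchases = [
    (date(2014, 1, 5), 20.0, True),
    (date(2014, 2, 10), 40.0, False),
    (date(2014, 3, 15), 60.0, True),
]
window = [p for p in purchases if p[0] < as_of]
times = [t for t, _, _ in window]
amounts = [a for _, a, _ in window]

features = {
    "count": len(window),
    "sum": sum(amounts),
    "mean": mean(amounts),
    "prop_online": sum(1 for _, _, online in window if online) / len(window),
    "days_since_latest": (as_of - max(times)).days,
    "days_since_earliest": (as_of - min(times)).days,
}


def state_changes(values):
    """Count how often consecutive state values differ."""
    return sum(1 for a, b in zip(values, values[1:]) if a != b)


def gradient(points):
    """Least-squares slope of value per day over (date, value) points."""
    start = min(t for t, _ in points)
    xs = [(t - start).days for t, _ in points]
    ys = [v for _, v in points]
    mx, my = mean(xs), mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Made-up state facts: account status history and month-end balances.
status_history = ["active", "active", "suspended", "active"]
balances = [
    (date(2014, 1, 31), 100.0),
    (date(2014, 2, 28), 128.0),
    (date(2014, 3, 31), 159.0),
]

features["status_changes"] = state_changes(status_history)
features["balance_gradient"] = gradient(balances)
```

Each entry in `features` corresponds to one column of a feature vector for the entity, computed as of the chosen date.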