Slide 1
Slide 1 text
IVORY DATA MODELLING http://github.com/ambiata/ivory © Ambiata 2014
Slide 2
Slide 2 text
WHAT WE START WITH
Slide 4
Slide 4 text
WHAT WE NEED
Slide 5
Slide 5 text
Feature vectors (rows: instances, columns: features)

instance  gender  balance  purchases  zipcode  prop_online  num_accs
89340218  M       0.00     3          3001     1.00         1
48149407  -       634.83   16         4670     0.6875       3
18452274  F       15.12    2          -        0.50         1
07499337  M       33.56    2          -        1.00         1
62948721  F       98.34    12         3303     0.8333       4
93754723  -       523.81   23         2046     0.4782       1
00272446  F       1086.05  17         -        1.00         2
13374497  F       224.81   9          -        0.2222       1
31989993  M       78.21    2          2134     0.50         1
46474236  -       126.48   4          -        0.0          1
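As a rough illustration of what a feature vector is, the sketch below lays out per-entity feature values as fixed-order rows. It is not Ivory's API: the `feature_vector` helper and the dictionary layout are hypothetical, the two sample rows are read off the slide on the assumption that the values line up with the instance IDs in order, and "-" is assumed to mean a missing value (represented here as `None`).

```python
from typing import Dict, List, Optional, Tuple, Union

Value = Union[str, int, float]

# Column order taken from the slide.
FEATURES = ["gender", "balance", "purchases", "zipcode", "prop_online", "num_accs"]

# Sparse (entity, feature) -> value store; a missing key stands for "-" on the slide.
values: Dict[Tuple[str, str], Value] = {
    ("89340218", "gender"): "M",
    ("89340218", "balance"): 0.00,
    ("89340218", "purchases"): 3,
    ("89340218", "zipcode"): "3001",
    ("89340218", "prop_online"): 1.00,
    ("89340218", "num_accs"): 1,
    ("48149407", "balance"): 634.83,
    ("48149407", "purchases"): 16,
    ("48149407", "zipcode"): "4670",
    ("48149407", "prop_online"): 0.6875,
    ("48149407", "num_accs"): 3,
}


def feature_vector(entity: str) -> List[Optional[Value]]:
    """Return the entity's features in FEATURES order, None where missing."""
    return [values.get((entity, name)) for name in FEATURES]


for entity in ["89340218", "48149407"]:
    print(entity, feature_vector(entity))
```

Keeping the column order in one place (`FEATURES`) means every instance yields a vector of the same length, with gaps made explicit rather than dropped.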
Slide 6
Slide 6 text
[Diagram: Ingest facts → Ivory Repository → Extract features]
Slide 7
Slide 7 text
[Diagram: Source data → Fact ETL (entity resolution + attribution) → Factset → Ingest facts → Ivory Repository → Extract features]
Slide 8
Slide 8 text
WHAT’S A FACT?
Slide 9
Slide 9 text
WHAT’S A FEATURE?
Slide 10
Slide 10 text
FACT
• Atomic piece of information attributed to an entity
• 2 types: states and events
• Captured as close to the “source” as possible
Slide 11
Slide 11 text
• State facts
  • Demographics, e.g. gender, DOB, zipcode, etc.
  • Account statuses
  • Subscription states
  • Snapshots, e.g. account balance at end of month
  • Segments
Slide 12
Slide 12 text
• Event facts
  • Purchases
  • Page views
  • Phone calls
  • Queries
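One way to picture the fact definition and its two types is as a small record type. This is only a sketch, not Ivory's actual data model: the `Fact` and `FactKind` names, the field names, and the sample dates and amounts are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Union


class FactKind(Enum):
    STATE = "state"  # e.g. gender, account status, end-of-month balance
    EVENT = "event"  # e.g. a purchase, page view, or phone call


@dataclass(frozen=True)
class Fact:
    """An atomic piece of information attributed to one entity (hypothetical layout)."""
    entity: str
    attribute: str
    value: Union[str, int, float]
    when: date
    kind: FactKind


# Illustrative facts for a single entity; dates and values are invented.
facts = [
    Fact("89340218", "gender", "M", date(2014, 1, 1), FactKind.STATE),
    Fact("89340218", "purchase", 19.95, date(2014, 3, 7), FactKind.EVENT),
]

for fact in facts:
    print(fact.kind.value, fact.attribute, fact.value)
```

Each record stays atomic: one entity, one attribute, one value, one point in time. Anything richer is left to feature extraction.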
Slide 13
Slide 13 text
FEATURE
• Attribute that describes one aspect of an entity
• Derived from facts
• Simplest feature is “latest value before ‘date’”
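The "latest value before 'date'" feature can be sketched in a few lines. The `Fact` tuple, the `latest_before` function, and the sample zipcode facts are hypothetical, and "before" is assumed here to mean strictly earlier than the given date; the slides do not say whether the boundary is inclusive.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    entity: str
    attribute: str
    value: str
    when: date


def latest_before(facts: List[Fact], entity: str, attribute: str,
                  as_of: date) -> Optional[str]:
    """Value of the most recent matching fact strictly before as_of, else None."""
    candidates = [
        f for f in facts
        if f.entity == entity and f.attribute == attribute and f.when < as_of
    ]
    if not candidates:
        return None  # no fact yet: the feature is missing for this entity
    return max(candidates, key=lambda f: f.when).value


# Invented state facts: the entity changes zipcode in February 2014.
facts = [
    Fact("89340218", "zipcode", "3001", date(2013, 6, 1)),
    Fact("89340218", "zipcode", "3002", date(2014, 2, 1)),
]

print(latest_before(facts, "89340218", "zipcode", date(2014, 1, 1)))
print(latest_before(facts, "89340218", "zipcode", date(2014, 3, 1)))
```

Because the date is a parameter, the same facts can yield different feature values for different extraction dates.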
Slide 14
Slide 14 text
• Latest
• Days since latest, days since earliest
• Count, sum
• Mean, quantile, proportion
• Gradient, state changes
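Several of the feature types listed above reduce to simple aggregations over an entity's facts up to an extraction date. The sketch below computes a handful of them over invented purchase events and account statuses. The data, the variable names, and the strict "before `as_of`" cutoff are assumptions, and quantile and gradient are left out for brevity.

```python
from datetime import date
from statistics import mean

# Invented event facts for one entity: (date, amount, channel).
purchases = [
    (date(2014, 1, 5), 20.0, "online"),
    (date(2014, 2, 10), 35.0, "store"),
    (date(2014, 3, 1), 5.0, "online"),
    (date(2014, 3, 20), 40.0, "online"),
]
# Invented state facts: account status over time, oldest first.
statuses = ["active", "active", "suspended", "active"]

as_of = date(2014, 4, 1)

# Only facts strictly before the extraction date contribute.
window = [p for p in purchases if p[0] < as_of]
amounts = [amount for _, amount, _ in window]
dates = [d for d, _, _ in window]

features = {
    "count": len(window),
    "sum": sum(amounts),
    "mean": mean(amounts),
    "days_since_latest": (as_of - max(dates)).days,
    "days_since_earliest": (as_of - min(dates)).days,
    # Proportion of purchases made online.
    "prop_online": sum(1 for _, _, c in window if c == "online") / len(window),
    # Number of times the status differs from the previous one.
    "state_changes": sum(1 for a, b in zip(statuses, statuses[1:]) if a != b),
}

for name, value in features.items():
    print(name, value)
```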