Improving feature engineering in the lab and production with Ivory

IVORY © Ambiata 2015

Ben Lever, CTO & co-founder at Ambiata
@bmlever

IMPROVING FEATURE ENGINEERING

BUILDING & DEPLOYING MODELS

entity    gender  balance  purchases  zipcode  max.spend  is.new
89340218  “M”     0.00     3          “3001”   451.20     true
48149407  -       634.83   16         “4670”   128.22     false
18452274  “F”     15.12    2          -        15.45      false
07499337  “M”     33.56    2          -        17.12      false
62948721  “F”     98.34    12         “3303”   328.34     true
93754723  -       523.81   23         “2046”   63.98      true
00272446  “F”     1086.05  17         -        71.59      false
13374497  “F”     224.81   9          -        1042.43    false
31989993  “M”     78.21    2          “2134”   27.65      false
46474236  -       126.48   4          -        135.20     false

[The same table (gender, balance, purchases, zipcode, max.spend, is.new for ten entities) is repeated on the following slides with the labels INSTANCE, FEATURE VECTOR and FEATURE.]

[Figure: customer timeline at 2014-01-23 with a 2w window. Tracks: Subscription (Trial, Paid, Cancelled), Support queries, Transactions]

CONTINUAL DATA GROWTH
CONSTANT TIME FEATURE PREPARATION

Lab
- Once-off data set
- If it’s wrong, start again
- Variable time

Factory
- Ongoing data feeds
- Fix broken data on the fly
- Constant time

Feature preparation → Model training → Feature preparation → Model training → Feature preparation → Model training

Feature preparation: 85%
Modeling: 15%
… optimistically

Receive data every day
Prepare features every day
Batch score models every day
x N

IVORY
A Data Warehouse for Data Science
Apache V2 Licence

A scalable and extensible data store for storing facts and extracting features

Facts: immutable, typed values keyed along 3 dimensions

- v (value): values are structured: primitives, structs, list of values
- entity: like the primary key of a DB row: customer ID, account ID, user ID
- attribute: like the name of a DB column: gender, purchases, zipcode, balance
- time: when the value is valid from

E-A-V-T
- Dimensions are unbounded
- Unordered
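As a rough illustration of the E-A-V-T shape described above, a fact can be modelled as an immutable record. This is only a sketch: the `Fact` class, the `Value` alias and the field names are hypothetical and are not Ivory's own types; the sample values come from the example facts in this deck.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union

# Hypothetical value type: primitives, structs (dicts) and lists of values.
Value = Union[bool, int, float, str, dict, list]

@dataclass(frozen=True)  # frozen=True makes instances immutable
class Fact:
    entity: str      # like a primary key, e.g. a customer ID
    attribute: str   # like a column name, e.g. "gender"
    time: date       # when the value is valid from
    value: Value

# Facts are unordered; a plain list is used here only for convenience.
facts = [
    Fact("48149407", "gender", date(2007, 4, 1), "F"),
    Fact("48149407", "zipcode", date(2011, 4, 1), "2134"),
    Fact("62948721", "balance", date(2014, 2, 1), 3478.23),
]
```

Because the record is frozen, a changed value is represented by adding a new fact with a later time rather than by updating an existing one.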

E (entity)  A (attribute)  V (value)  T (time)
48149407    gender         “F”        2007-04-01
48149407    zipcode        “2134”     2011-04-01
48149407    balance        46.54      2014-02-04
62948721    gender         “F”        2013-03-14
93754723    gender         “M”        2013-09-27
48149407    zipcode        “5192”     2015-03-19
62948721    balance        3478.23    2014-02-01
62948721    balance        12099.21   2013-02-01
48149407    has.children   true       2015-02-17

A scalable and extensible data store for storing facts and extracting features

E (entity)  A (attribute)  V (value)  T (time)
48149407    gender         “F”        2007-04-01
48149407    zipcode        “2134”     2011-04-01
48149407    purchases      4          2014-02-04
62948721    gender         “F”        2013-03-14
93754723    gender         “M”        2013-09-27
48149407    zipcode        “5192”     2015-03-19
62948721    balance        3478.23    2014-02-01
62948721    balance        12099.21   2013-02-01
48149407    has.children   true       2015-02-17

[Feature table repeated (columns: gender, balance, purchases, zipcode, max.spend, is.new), with the row key labeled: entity @ time]

Same times: all ten entities (89340218, 48149407, 18452274, 07499337, 62948721, 93754723, 00272446, 13374497, 31989993, 46474236), each @ 2015-04-03

Snapshot
- All entities
- Same time for all entities
- Features for model scoring
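The snapshot idea can be sketched in a few lines. The code below is an illustration, not Ivory's implementation: the `snapshot` function is hypothetical, and it assumes that "the value at a time" means the latest fact whose time is on or before the snapshot date. The facts are a subset of the example facts in this deck.

```python
from datetime import date

# (entity, attribute, time, value) tuples from the example facts in this deck.
facts = [
    ("48149407", "gender", date(2007, 4, 1), "F"),
    ("48149407", "zipcode", date(2011, 4, 1), "2134"),
    ("48149407", "zipcode", date(2015, 3, 19), "5192"),
    ("62948721", "balance", date(2013, 2, 1), 12099.21),
    ("62948721", "balance", date(2014, 2, 1), 3478.23),
]

def snapshot(facts, as_of):
    """Latest value per (entity, attribute) among facts valid on or before as_of."""
    latest = {}  # (entity, attribute) -> (time, value)
    for entity, attribute, time, value in facts:
        if time > as_of:
            continue  # not yet valid at the snapshot date
        key = (entity, attribute)
        if key not in latest or time > latest[key][0]:
            latest[key] = (time, value)
    rows = {}  # entity -> {attribute: value}
    for (entity, attribute), (_, value) in latest.items():
        rows.setdefault(entity, {})[attribute] = value
    return rows

# Every entity is evaluated at the same date.
rows = snapshot(facts, date(2015, 4, 3))
```

Here `rows` holds one feature row per entity, all taken at the same date, which is the shape needed for batch scoring.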

[Figure: subscription timelines (Trial, Paid, Cancelled) for entities 89340218, 48149407 and 18452274, with the dates 2014-05-17, 2014-01-23 and 2014-09-24 marked]

Different times
89340218 @ 2015-04-03
48149407 @ 2014-02-01
18452274 @ 2013-04-03
07499337 @ 2015-01-13
62948721 @ 2013-10-26
48149407 @ 2014-12-14
00272446 @ 2014-11-22
13374497 @ 2013-09-17
62948721 @ 2014-10-30
46474236 @ 2015-08-27

Same entity, different times (48149407 and 62948721 each appear twice)

Snapshot
- All entities
- Same time for all entities
- Features for model scoring

Chord
- Specific entities
- Different times for different entities
- Features for model training
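A chord differs from a snapshot in that each requested entity carries its own time. The following sketch shows one way to express that; `value_at` and `chord` are hypothetical helper names, not Ivory's API, and the lookup again assumes "latest fact on or before the requested time". The facts and the entity/time pairs are taken from the examples in this deck.

```python
from datetime import date

# (entity, attribute, time, value) tuples from the example facts in this deck.
facts = [
    ("48149407", "zipcode", date(2011, 4, 1), "2134"),
    ("48149407", "zipcode", date(2015, 3, 19), "5192"),
    ("62948721", "balance", date(2013, 2, 1), 12099.21),
    ("62948721", "balance", date(2014, 2, 1), 3478.23),
]

def value_at(facts, entity, attribute, as_of):
    """Most recent value of one attribute for one entity on or before as_of, else None."""
    candidates = [(t, v) for e, a, t, v in facts
                  if e == entity and a == attribute and t <= as_of]
    return max(candidates, key=lambda tv: tv[0])[1] if candidates else None

def chord(facts, pairs, attributes):
    """One feature row per (entity, time) pair; an entity may appear at several times."""
    return [
        {"entity": e, "time": t, **{a: value_at(facts, e, a, t) for a in attributes}}
        for e, t in pairs
    ]

# The same entity is requested at two different times.
pairs = [
    ("62948721", date(2013, 10, 26)),
    ("62948721", date(2014, 10, 30)),
    ("48149407", date(2014, 12, 14)),
]
rows = chord(facts, pairs, ["balance", "zipcode"])
```

Each row only sees facts that were valid at its own time, so a training row for 62948721 at 2013-10-26 does not pick up the balance recorded in 2014.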


E (entity)  A (attribute)  V (value)  T (time)
48149407    gender         “F”        2007-04-01
48149407    zipcode        “2134”     2011-04-01
48149407    purchases      4          2014-02-04
62948721    gender         “F”        2013-03-14
93754723    gender         “M”        2013-09-27
48149407    zipcode        “5192”     2015-03-19
62948721    balance        3478.23    2014-02-01
62948721    balance        12099.21   2013-02-01
48149407    has.children   true       2015-02-17

Base attributes / Base features

entity @ time          gender  balance  zipcode
89340218 @ 2015-04-03  “M”     0.00     “3001”
48149407 @ 2014-02-01  -       634.83   “4670”
18452274 @ 2013-04-03  “F”     15.12    -
07499337 @ 2015-01-13  “M”     33.56    -
62948721 @ 2013-10-26  “F”     98.34    “3303”
48149407 @ 2014-12-14  -       523.81   “2046”
00272446 @ 2014-11-22  “F”     1086.05  -
13374497 @ 2013-09-17  “F”     224.81   -
62948721 @ 2014-10-30  “M”     78.21    “2134”
46474236 @ 2015-08-27  -       126.48   -

(The purchases, max.spend and is.new columns are not yet filled at this point.)

[Figure: timeline up to 2013-04-03 showing zipcode values (“3001”, “2046”, “2045”) and transaction events, with windowed expressions applied in turn: count over 2w, count over 4w, max over 4w]
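The windowed expressions on these slides (count over 2w, count over 4w, max over 4w) can be sketched as follows. The transaction dates and amounts are made up for illustration, `in_window` is a hypothetical helper, and the choice of an exclusive start and inclusive end for the window is an assumption.

```python
from datetime import date, timedelta

# Hypothetical transactions for one entity: (time, amount).
transactions = [
    (date(2013, 3, 1), 20.0),
    (date(2013, 3, 12), 75.5),
    (date(2013, 3, 25), 12.0),
    (date(2013, 4, 2), 40.0),
]

def in_window(events, as_of, weeks):
    """Events whose time falls in the half-open interval (as_of - weeks, as_of]."""
    start = as_of - timedelta(weeks=weeks)
    return [(t, x) for t, x in events if start < t <= as_of]

as_of = date(2013, 4, 3)
count_2w = len(in_window(transactions, as_of, 2))
count_4w = len(in_window(transactions, as_of, 4))
# default=None covers the case where the window contains no events.
max_4w = max((x for _, x in in_window(transactions, as_of, 4)), default=None)
```

Because the window is anchored at each row's own time, the same definition can be reused for every entity/time pair in a snapshot or chord.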

Virtual attributes / Virtual features

entity @ time          gender  balance  purchases  zipcode  max.spend  is.new
89340218 @ 2015-04-03  “M”     0.00     3          “3001”   451.20     true
48149407 @ 2014-02-01  -       634.83   16         “4670”   128.22     false
18452274 @ 2013-04-03  “F”     15.12    2          -        15.45      false
07499337 @ 2015-01-13  “M”     33.56    2          -        17.12      false
62948721 @ 2013-10-26  “F”     98.34    12         “3303”   328.34     true
48149407 @ 2014-12-14  -       523.81   23         “2046”   63.98      true
00272446 @ 2014-11-22  “F”     1086.05  17         -        71.59      false
13374497 @ 2013-09-17  “F”     224.81   9          -        1042.43    false
62948721 @ 2014-10-30  “M”     78.21    2          “2134”   27.65      false
46474236 @ 2015-08-27  -       126.48   4          -        135.20     false

encoding
string
double
string
transaction      {item:string, amnt:double}
support_query    {type:string, rating:int}

source       window  expression
transaction  4w      count
transaction  4w      max(amnt)
query        12w     count > 0
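The source / window / expression triples above lend themselves to a declarative representation. The sketch below is not Ivory's dictionary format: `VirtualFeature` and `parse_window` are hypothetical names, and only week-based windows such as "4w" are handled because those are the only ones shown on the slides.

```python
import re
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class VirtualFeature:
    source: str      # fact attribute to read, e.g. "transaction"
    window: str      # look-back period, e.g. "4w"
    expression: str  # aggregate over the window, e.g. "max(amnt)"

# The three definitions shown on the slide.
definitions = [
    VirtualFeature("transaction", "4w", "count"),
    VirtualFeature("transaction", "4w", "max(amnt)"),
    VirtualFeature("query", "12w", "count > 0"),
]

def parse_window(spec):
    """Turn a window such as "4w" into a timedelta. Only weeks ("w") are handled here."""
    match = re.fullmatch(r"(\d+)w", spec)
    if match is None:
        raise ValueError(f"unsupported window: {spec!r}")
    return timedelta(weeks=int(match.group(1)))

windows = [parse_window(d.window) for d in definitions]
```

Keeping the definitions as data means the same list can drive feature extraction in both the lab and production runs.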

Snapshot

Chord

... > ivory ingest ...
... > ivory snapshot ...
... > ivory chord ...
