Improving feature engineering in the lab and production with Ivory

Talk given by Ben Lever at Strata+Hadoop World in London, 6 May 2015:

Feature engineering is a critical and time-consuming activity in the development and deployment of any modelling pipeline. The burden grows as data science teams seek to incorporate new data sources at a scale far larger than previously employed. Furthermore, the transition to production environments is littered with complexity as these pipelines are exposed to the dynamic, and fragile, world of ongoing data feeds, data corrections, and evolving data models.

In this talk we will introduce Ivory, a new open-source, Hadoop-based data store that seeks to address these challenges. Ivory is a scalable and extensible data store for storing facts and extracting features. It is optimised specifically for the feature engineering stages of modelling pipelines, simultaneously simplifying and adding rigour to them.

This session will walk through an example of how Ivory can be used in the typical data scientist’s workflow, and then how that extends to migrating pipelines into production. It will impart all of the basic concepts of Ivory such as repositories, the dictionary, its fact-based data model, and virtual features. It will also demonstrate the benefits of Ivory being an immutable data store and the unique opportunities that creates.

Ambiata

May 06, 2015

Transcript

  1. © Ambiata 2015 Feature matrix, one row per entity:

    | entity | gender | balance | purchases | zipcode | max.spend | is.new |
    |---|---|---|---|---|---|---|
    | 89340218 | “M” | 0.00 | 3 | “3001” | 451.20 | true |
    | 48149407 | - | 634.83 | 16 | “4670” | 128.22 | false |
    | 18452274 | “F” | 15.12 | 2 | - | 15.45 | false |
    | 07499337 | “M” | 33.56 | 2 | - | 17.12 | false |
    | 62948721 | “F” | 98.34 | 12 | “3303” | 328.34 | true |
    | 93754723 | - | 523.81 | 23 | “2046” | 63.98 | true |
    | 00272446 | “F” | 1086.05 | 17 | - | 71.59 | false |
    | 13374497 | “F” | 224.81 | 9 | - | 1042.43 | false |
    | 31989993 | “M” | 78.21 | 2 | “2134” | 27.65 | false |
    | 46474236 | - | 126.48 | 4 | - | 135.20 | false |
  2-13. Builds of the same matrix, with the callouts INSTANCE (slide 3), FEATURE VECTOR (slide 4) and FEATURE (slide 6).

    Slide 12 also shows the gender values unquoted: M - F M F - F F M -
  14. Subscription, Trial, Paid, Cancelled, Transactions, Support queries

    Continual data growth; constant-time feature preparation
  15-16. Lab: once-off data set; if it’s wrong, start again; variable time

    Factory: ongoing data feeds; fix broken data on the fly; constant time
  17-18. Receive data every day; prepare features every day; batch score models every day (x N)
  19. Facts: immutable, typed values keyed along 3 dimensions

    v (value): values are structured: primitives, structs, lists of values
  20. entity: like the primary key of a DB row (customer ID, account ID, user ID)
  21. attribute: like the name of a DB column (gender, purchases, zipcode, balance)
  22. time: when the value is valid from
  23. E-A-V-T: dimensions are unbounded; unordered
  24-36. Example facts, built up one row at a time:

    | E (entity) | A (attribute) | V (value) | T (time) |
    |---|---|---|---|
    | 48149407 | gender | “F” | 2007-04-01 |
    | 48149407 | zipcode | “2134” | 2011-04-01 |
    | 48149407 | balance | 46.54 | 2014-02-04 |
    | 62948721 | gender | “F” | 2013-03-14 |
    | 93754723 | gender | “M” | 2013-09-27 |
    | 48149407 | zipcode | “5192” | 2015-03-19 |
    | 62948721 | balance | 3478.23 | 2014-02-01 |
    | 62948721 | balance | 12099.21 | 2013-02-01 |
    | 48149407 | has.children | true | 2015-02-17 |
  37. The same facts, with the third row shown as 48149407, purchases, 4, 2014-02-04 in place of the balance fact.
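
The fact model on the slides above can be sketched in a few lines of Python. This is only an illustration of the E-A-V-T idea, not Ivory's actual storage format or API: the `Fact` class and the `history` helper are hypothetical names, and values are limited to primitives for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union

# Assumption: primitives only; the slides also allow structs and lists.
Value = Union[str, float, int, bool]


@dataclass(frozen=True)  # frozen = a fact cannot be modified once created
class Fact:
    entity: str      # like the primary key of a DB row
    attribute: str   # like the name of a DB column
    value: Value
    time: date       # when the value is valid from


# The example facts from the slides, in the (unordered) order shown there.
facts = [
    Fact("48149407", "gender", "F", date(2007, 4, 1)),
    Fact("48149407", "zipcode", "2134", date(2011, 4, 1)),
    Fact("48149407", "balance", 46.54, date(2014, 2, 4)),
    Fact("62948721", "gender", "F", date(2013, 3, 14)),
    Fact("93754723", "gender", "M", date(2013, 9, 27)),
    Fact("48149407", "zipcode", "5192", date(2015, 3, 19)),
    Fact("62948721", "balance", 3478.23, date(2014, 2, 1)),
    Fact("62948721", "balance", 12099.21, date(2013, 2, 1)),
    Fact("48149407", "has.children", True, date(2015, 2, 17)),
]


def history(facts, entity, attribute):
    """Return every fact recorded for one entity/attribute, oldest first."""
    matching = [f for f in facts if f.entity == entity and f.attribute == attribute]
    return sorted(matching, key=lambda f: f.time)
```

Because a change of zipcode is stored as a new fact rather than an update, `history(facts, "48149407", "zipcode")` returns both the 2011 and the 2015 values.
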
  38-39. The feature matrix from slide 1 (gender, balance, purchases, zipcode, max.spend, is.new for the ten entities).
  40-43. The same matrix shown together with a second set of values; slides 42 and 43 label the rows “entity” and then “entity @ time”. Two columns of the second set carry no recoverable header:

    | entity | balance | purchases | zipcode | (unlabelled) | (unlabelled) | gender |
    |---|---|---|---|---|---|---|
    | 89340218 | 0.00 | 3 | 3001 | 1.00 | 1 | M |
    | 48149407 | 634.83 | 16 | 4670 | 0.6875 | 3 | - |
    | 18452274 | 15.12 | 2 | - | 0.50 | 1 | F |
    | 07499337 | 33.56 | 2 | - | 1.00 | 1 | M |
    | 62948721 | 98.34 | 12 | 3303 | 0.8333 | 4 | F |
    | 93754723 | 523.81 | 23 | 2046 | 0.4782 | 1 | - |
    | 00272446 | 1086.05 | 17 | - | 1.00 | 2 | F |
    | 13374497 | 224.81 | 9 | - | 0.2222 | 1 | F |
    | 31989993 | 78.21 | 2 | 2134 | 0.50 | 1 | M |
    | 46474236 | 126.48 | 4 | - | 0.0 | 1 | - |
  44. The empty matrix: the six feature names as columns and the ten entities as rows.
  45-46. Every entity paired with the same time, 2015-04-03 (callout: “Same times”).
  47. Snapshot: all entities; same time for all entities; features for model scoring
  48-52. The empty matrix again, now with specific entities, each paired with its own time (callouts: “Different times”, then “Same entity, different times”):

    | entity | @ time |
    |---|---|
    | 89340218 | 2015-04-03 |
    | 48149407 | 2014-02-01 |
    | 18452274 | 2013-04-03 |
    | 07499337 | 2015-01-13 |
    | 62948721 | 2013-10-26 |
    | 48149407 | 2014-12-14 |
    | 00272446 | 2014-11-22 |
    | 13374497 | 2013-09-17 |
    | 62948721 | 2014-10-30 |
    | 46474236 | 2015-08-27 |
  53-57. Snapshot: all entities; same time for all entities; features for model scoring

    Chord: specific entities; different times for different entities; features for model training
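
The difference between the two extractions can be sketched as follows. This is a toy in-memory illustration, not how Ivory implements them: the function names are hypothetical, facts are plain tuples, and it assumes that a fact dated exactly on the requested day counts as valid.

```python
from datetime import date

# (entity, attribute, value, time) tuples copied from the fact list on the slides.
FACTS = [
    ("48149407", "gender", "F", date(2007, 4, 1)),
    ("48149407", "zipcode", "2134", date(2011, 4, 1)),
    ("48149407", "balance", 46.54, date(2014, 2, 4)),
    ("62948721", "gender", "F", date(2013, 3, 14)),
    ("93754723", "gender", "M", date(2013, 9, 27)),
    ("48149407", "zipcode", "5192", date(2015, 3, 19)),
    ("62948721", "balance", 3478.23, date(2014, 2, 1)),
    ("62948721", "balance", 12099.21, date(2013, 2, 1)),
    ("48149407", "has.children", True, date(2015, 2, 17)),
]


def latest_as_of(facts, entity, as_of):
    """Latest value of each attribute for `entity`, ignoring facts after `as_of`."""
    best = {}  # attribute -> (time, value)
    for fact_entity, attribute, value, time in facts:
        if fact_entity != entity or time > as_of:
            continue
        if attribute not in best or time > best[attribute][0]:
            best[attribute] = (time, value)
    return {attribute: value for attribute, (_, value) in best.items()}


def snapshot(facts, as_of):
    """All entities, same time for every entity (features for scoring)."""
    entities = sorted({entity for entity, _, _, _ in facts})
    return {entity: latest_as_of(facts, entity, as_of) for entity in entities}


def chord(facts, requests):
    """Specific (entity, time) pairs; one entity may appear at several times
    (features for training)."""
    return [(entity, time, latest_as_of(facts, entity, time)) for entity, time in requests]
```

For example, `chord(FACTS, [("48149407", date(2014, 2, 1)), ("48149407", date(2014, 12, 14))])` yields two rows for the same entity: the first has no balance yet, because that fact is dated 2014-02-04, while the second includes it.
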
  58-62. The chord request from slides 48-52: each entity paired with its own time, with the feature columns still empty.
  63. The example facts from slide 37 (E A V T).
  64-69. The matrix is filled in cell by cell, starting with zipcode “3001”, balance 0.00 and gender “M” in the first row.
  70-75. gender, balance and zipcode filled in for every row (callouts: “Base attributes”, then “Base features”).
  76-82. purchases, max.spend and is.new filled in as well (callouts: “Virtual attributes”, then “Virtual features”), giving the complete matrix:

    | entity | @ time | gender | balance | purchases | zipcode | max.spend | is.new |
    |---|---|---|---|---|---|---|---|
    | 89340218 | 2015-04-03 | “M” | 0.00 | 3 | “3001” | 451.20 | true |
    | 48149407 | 2014-02-01 | - | 634.83 | 16 | “4670” | 128.22 | false |
    | 18452274 | 2013-04-03 | “F” | 15.12 | 2 | - | 15.45 | false |
    | 07499337 | 2015-01-13 | “M” | 33.56 | 2 | - | 17.12 | false |
    | 62948721 | 2013-10-26 | “F” | 98.34 | 12 | “3303” | 328.34 | true |
    | 48149407 | 2014-12-14 | - | 523.81 | 23 | “2046” | 63.98 | true |
    | 00272446 | 2014-11-22 | “F” | 1086.05 | 17 | - | 71.59 | false |
    | 13374497 | 2013-09-17 | “F” | 224.81 | 9 | - | 1042.43 | false |
    | 62948721 | 2014-10-30 | “M” | 78.21 | 2 | “2134” | 27.65 | false |
    | 46474236 | 2015-08-27 | - | 126.48 | 4 | - | 135.20 | false |
  83-87. The dictionary, built up row by row:

    | name | encoding | source | window | expression |
    |---|---|---|---|---|
    | gender | string | | | |
    | balance | double | | | |
    | zipcode | string | | | |
    | transaction | {item:string, amnt:double} | | | |
    | purchases | | transaction | 4w | count |
    | max.spend | | transaction | 4w | max(amnt) |
    | support_query | {type:string, rating:int} | | | |
    | is.new | | query | 12w | count > 0 |
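
To make the source / window / expression columns concrete, here is a rough sketch of how such virtual features could be evaluated for one entity at one point in time. It is not Ivory's expression engine: the helper names, the sample facts (made up for illustration) and the window boundaries (start exclusive, end inclusive) are all assumptions.

```python
from datetime import date, timedelta

# Made-up struct-valued facts for a single entity: (time, value).
transactions = [
    (date(2015, 3, 1), {"item": "book", "amnt": 20.00}),
    (date(2015, 3, 20), {"item": "shoes", "amnt": 85.50}),
    (date(2015, 4, 1), {"item": "coffee", "amnt": 4.20}),
]
support_queries = [
    (date(2015, 2, 10), {"type": "billing", "rating": 4}),
]


def in_window(facts, as_of, weeks):
    """Values of the facts that fall in the `weeks` leading up to `as_of`."""
    start = as_of - timedelta(weeks=weeks)
    return [value for time, value in facts if start < time <= as_of]


def purchases(facts, as_of):
    """source = transaction, window = 4w, expression = count"""
    return len(in_window(facts, as_of, 4))


def max_spend(facts, as_of):
    """source = transaction, window = 4w, expression = max(amnt)"""
    amounts = [value["amnt"] for value in in_window(facts, as_of, 4)]
    return max(amounts) if amounts else None  # no value if the window is empty


def is_new(facts, as_of):
    """window = 12w, expression = count > 0"""
    return len(in_window(facts, as_of, 12)) > 0
```

With an extraction date of 2015-04-03, the 4-week window starts on 2015-03-06, so the transaction of 1 March is ignored: `purchases(transactions, date(2015, 4, 3))` counts two transactions and `max_spend` picks the 85.50 purchase. Only the base facts are stored; the derived values are recomputed for whatever time is requested.
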
  88. Command-line workflow:

    > ivory create-repository /my-repo
    > ivory import-dictionary ...
    > ivory ingest ...
    > ivory snapshot ...
    > ivory chord ...
  89-92. Diagram of items numbered 0 to 15; slide 90 adds the label “snapshot” and slide 91 the label “chord”.
  93. Ivory in production for ~18 months

    Ambiata-hosted repositories (AWS); self-hosted repositories (on-prem)