Ivory is a scalable and extensible data store for storing facts and extracting features. It can be used within a large machine learning pipeline for normalising data and providing feeds to model training and scoring pipelines.
Ivory has several interesting properties. It:
- Has no moving parts - just files on disk;
- Is optimised for scans, not random access;
- Is extensible along the dimension of features;
- Is scalable by using HDFS or S3 as a backing store;
- Is an immutable data store allowing version "roll backs".
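
To make these properties concrete, here is a minimal in-memory sketch of an append-only fact store with versioned snapshots. It is an illustration only, not Ivory's API or on-disk format: `Fact`, `FactStore`, `commit` and `snapshot` are hypothetical names, and the in-memory batches stand in for the immutable files a real store would keep on disk, HDFS or S3.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    """Hypothetical fact record: one observed value for one entity."""
    entity: str     # who the fact is about
    attribute: str  # which feature the fact describes
    value: str
    time: int       # when the fact was observed (any sortable timestamp)


class FactStore:
    """Toy append-only store: every commit creates a new immutable version."""

    def __init__(self) -> None:
        # _versions[i] is the tuple of fact batches visible at version i.
        # Version 0 is the empty store.
        self._versions: List[Tuple[Tuple[Fact, ...], ...]] = [()]

    def commit(self, batch: List[Fact]) -> int:
        # Old versions are never modified; the new version reuses their batches.
        self._versions.append(self._versions[-1] + (tuple(batch),))
        return len(self._versions) - 1

    def snapshot(self, version: int, as_of: int) -> Dict[Tuple[str, str], str]:
        # Full scan over every batch in the version, keeping the most recent
        # value per (entity, attribute) observed at or before `as_of`.
        latest: Dict[Tuple[str, str], Fact] = {}
        for batch in self._versions[version]:
            for fact in batch:
                if fact.time > as_of:
                    continue
                key = (fact.entity, fact.attribute)
                if key not in latest or fact.time >= latest[key].time:
                    latest[key] = fact
        return {key: fact.value for key, fact in latest.items()}


store = FactStore()
v1 = store.commit([Fact("cust-1", "postcode", "2000", 1)])
# A new attribute ("plan") arrives later without touching earlier batches.
v2 = store.commit([
    Fact("cust-1", "postcode", "3000", 2),
    Fact("cust-1", "plan", "gold", 2),
])

current = store.snapshot(v2, as_of=2)      # latest view of the store
rolled_back = store.snapshot(v1, as_of=2)  # "roll back" by reading version 1
```

In this sketch, rolling back is just reading an earlier version number, which is only possible because no commit ever rewrites existing data. Feature extraction is a sequential scan over all batches rather than a keyed lookup, mirroring the scan-oriented design described above.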