Slide 37
Feature Extraction
List of package ids (ordered by timestamp)
mid: [10, 4465, 1025960, 5413, 1456, 1299646, ...]
⁃ 10: [0.1, 0.5, 0.2, 0.2, 0.35, ...]
⁃ 4465: [0.9, 0.8, 0.4, 0.1, 0.2, ...]
⁃ 1025960: [0.45, 0.2, 0.2, 0.6, 0.9, ...]
⁃ 5413: [0.8, 0.8, 0.1, 0.2, 0.5, ...]
⁃ 1456: [0.7, 0.3, 0.7, 0.3, 0.2, ...]
⁃ 10: [0.0, 0.0, 0.0, ..., 0.3, 0.1, 0.9, ..., 0.0, 0.0, 0.0]
⁃ 4465: [0.3, 0.6, 0.1, ..., 0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0]
⁃ 1025960: [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0, ..., 0.7, 0.3, 0.1]
⁃ 5413: [0.9, 0.1, 0.1, ..., 0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0]
⁃ 1456: [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0, ..., 0.2, 0.2, 0.6]
mid: [0.1, 0.2, 0.1, ..., 0.3, 0.1, 0.8, ..., 0.1, 0.6, 0.4]
mid: [0.5, 0.3, 0.2, 0.9, 0.8]
fastText
GMM
Accumulate
dimensionality reduction (PCA with matrix sketching)
y-sparse-features
(typically 6,000 dimensions)
y-dense-features
(typically 400 dimensions)
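
The sparse vectors on this slide each have one non-zero block, and the Accumulate step merges them into a single vector per mid. The following is a minimal sketch of that idea, not the implementation behind the slide. It assumes hard cluster assignments (standing in for the GMM step), an element-wise sum as the accumulation rule, and a toy setup of 3 clusters with 3-dimensional blocks. The names `place_in_cluster_block`, `accumulate`, and the cluster indices are hypothetical. The fastText embedding and the PCA reduction are omitted.

```python
from typing import Dict, List


def place_in_cluster_block(vec: List[float], cluster: int, n_clusters: int) -> List[float]:
    """Return a zero vector of length n_clusters * len(vec) with `vec`
    copied into the block that belongs to `cluster`."""
    dim = len(vec)
    out = [0.0] * (n_clusters * dim)
    start = cluster * dim
    out[start:start + dim] = vec
    return out


def accumulate(vectors: List[List[float]]) -> List[float]:
    """Element-wise sum of equally sized vectors (assumed accumulation rule)."""
    return [sum(column) for column in zip(*vectors)]


# Toy inputs: block values borrowed from the slide, cluster indices assumed.
N_CLUSTERS = 3
package_ids: List[int] = [10, 4465, 1025960]  # ordered by timestamp
block_values: Dict[int, List[float]] = {
    10: [0.3, 0.1, 0.9],
    4465: [0.3, 0.6, 0.1],
    1025960: [0.7, 0.3, 0.1],
}
assigned_cluster: Dict[int, int] = {10: 1, 4465: 0, 1025960: 2}  # hypothetical

sparse_per_package = [
    place_in_cluster_block(block_values[p], assigned_cluster[p], N_CLUSTERS)
    for p in package_ids
]
mid_sparse_features = accumulate(sparse_per_package)
print(mid_sparse_features)
```

In the pipeline on the slide, the accumulated vector (y-sparse-features) would then be passed to the dimensionality reduction step to obtain y-dense-features. A real GMM yields soft responsibilities, so a package could contribute to more than one block.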