Slide 14
Pre-processing Activity Logs
• Why?
– Activity logs are time series data
– Doesn’t make sense to use every data point as a feature
– Makes more sense to “compress” each user’s series into a
single number per feature
• How?
– For a given feature (e.g., audio calls), build a model of
what “normal” user activity looks like and another model
of what fraudulent activity looks like
– Score each user by which of the two models fits their
activity better
– This score is called a “log-likelihood ratio”
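
The steps above can be sketched in a few lines. This is only an illustration, not the method behind the slide: it assumes the feature is a count per time step (e.g., audio calls per day) and that each class can be modeled with a single Poisson rate. The slide does not say which model was actually used, and the function names and toy data below are hypothetical.

```python
import math

def fit_poisson_rate(series_list):
    """Estimate one Poisson rate (mean count per time step) from many users' series."""
    total = sum(sum(series) for series in series_list)
    steps = sum(len(series) for series in series_list)
    return total / steps

def poisson_log_pmf(x, rate):
    """Log-probability of observing count x under a Poisson model (rate must be > 0)."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)

def log_likelihood_ratio(series, fraud_rate, normal_rate):
    """Sum over time steps of log P(x | fraud) - log P(x | normal).

    Positive: the series looks more like the fraud model.
    Negative: the series looks more like the normal model.
    """
    return sum(
        poisson_log_pmf(x, fraud_rate) - poisson_log_pmf(x, normal_rate)
        for x in series
    )

# Toy labeled data (made up for illustration): calls per day for a few users.
normal_users = [[2, 3, 1, 2], [1, 2, 2, 3]]
fraud_users = [[9, 12, 10, 11], [8, 10, 13, 9]]

normal_rate = fit_poisson_rate(normal_users)
fraud_rate = fit_poisson_rate(fraud_users)

# Each user's whole time series is compressed into one number.
heavy_user_score = log_likelihood_ratio([10, 11, 9, 12], fraud_rate, normal_rate)
light_user_score = log_likelihood_ratio([2, 1, 3, 2], fraud_rate, normal_rate)
```

Under these assumptions, heavy_user_score comes out positive and light_user_score negative. That single number per user can then be used as the feature in place of the raw time series.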