1: Limit your problem space. Don't try to "detect anomalies"; instead, "detect anomalies in disk usage" (or something equally specific).
Step 2: Models that apply to little data don't apply to big data (and vice versa). Never forget to consider the characteristics of your data.
Monday, September 23, 13
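
To make the "limit your problem space" advice concrete, here is a minimal sketch of a detector scoped to one signal, disk usage, rather than to "anomalies" in general. The function name `disk_usage_anomalies`, the z-score approach, and the default threshold of 3.0 are illustrative assumptions, not something the slides prescribe. A z-score also assumes a small, roughly normal sample, which is exactly the kind of data characteristic the second step says to check before reusing a model elsewhere.

```python
from statistics import mean, stdev


def disk_usage_anomalies(samples, threshold=3.0):
    """Return indices of disk-usage samples that look anomalous.

    Hypothetical helper: flags samples whose z-score (distance from the
    mean in sample standard deviations) exceeds `threshold`.
    `samples` is a list of disk-usage percentages.
    """
    if len(samples) < 2:
        return []  # stdev needs at least two data points
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []  # all samples identical, nothing stands out
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]


# Twenty steady readings at 50% followed by one spike to 95%.
readings = [50.0] * 20 + [95.0]
print(disk_usage_anomalies(readings))
```

Because the detector only knows about disk usage, its threshold and assumptions can be tuned to how that one metric behaves, instead of trying to fit every kind of data at once.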