Slide 48
Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs /
StreamSpot
bit.ly/streamspot
Experiment Setup
Datasets
2 malicious browser-based scenarios
• Flash Player drive-by download (CVE-2015-5119)*
• JRE untrusted code execution (CVE-2012-4681)
Table 1: Dataset summary: Training scenarios and test edges (attack + 25% ben
Dataset  Scenarios                             # Graphs  Avg. |V|  Avg. |E|
YDC      YouTube, Download, CNN                300       8705      239648
GFC      GMail, VGame, CNN                     300       8151      148414
ALL      YouTube, Download, CNN, GMail, VGame  500       8315      173857
Figure 4: Distribution of pairwise cosine distances [...] different values of chunk lengths. Panels: (a) YDC, (b) GFC, (c) ALL.
We aim to choose a C that makes all pairs of
graphs neither too similar nor too dissimilar. Figure 5 shows the entropy
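The criterion above can be illustrated with a minimal sketch, assuming each graph has already been turned into a non-negative count vector for a given chunk length C. The function names, the 10-bin histogram, and the "prefer the highest entropy" rule are assumptions made for illustration and are not taken from the slide: a distribution of pairwise cosine distances that collapses into one bin (all pairs too similar or all too dissimilar) has low entropy, while a spread-out distribution has high entropy.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumed convention for all-zero vectors
    return 1.0 - dot / (norm_u * norm_v)


def distance_entropy(vectors, bins=10):
    """Shannon entropy (in bits) of the histogram of pairwise cosine distances.

    Assumes non-negative vectors, so every distance lies in [0, 1].
    """
    dists = [cosine_distance(u, v) for u, v in combinations(vectors, 2)]
    counts = [0] * bins
    for d in dists:
        d = min(max(d, 0.0), 1.0)  # clamp away floating-point overshoot
        counts[min(int(d * bins), bins - 1)] += 1
    total = len(dists)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def pick_chunk_length(vectors_by_c):
    """Hypothetical selection rule: the candidate C with the highest entropy."""
    return max(vectors_by_c, key=lambda c: distance_entropy(vectors_by_c[c]))
```

Here `vectors_by_c` is a hypothetical dictionary mapping each candidate chunk length to the list of graph vectors built with it; how those vectors are constructed is outside the scope of this sketch.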
based on which we plot the precision
curves. As a baseline, we use iFore
and 75% subsampling rate with each
a vector of 10 structural features:
degree and distinct-degree5, the av
shortest-path length, and the diamete
of nodes/edges. The curves (average
random samples) for all the datasets
Note that even with 25% of the data,
effective in correctly ranking the atta
an average precision (AP, area under
than 0.9 and a near-ideal AUC (area
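The two ranking metrics named in the fragment above, average precision (AP) and AUC, can be computed from anomaly scores as in the following minimal sketch. It uses the common step-wise definition of AP and the pairwise-comparison definition of ROC AUC; the function names and the assumption that a higher score means "more anomalous" are illustrative, not taken from the slide.

```python
def average_precision(scores, labels):
    """AP: mean of the precision values measured at the rank of each attack."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precisions = []
    for rank, (_, is_attack) in enumerate(ranked, start=1):
        if is_attack:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def roc_auc(scores, labels):
    """AUC: chance that a random attack outscores a random benign item.

    Ties between an attack score and a benign score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one attack and one benign item")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With made-up scores `[0.9, 0.8, 0.3, 0.1]` and labels `[1, 0, 1, 0]` (1 marks an attack), one attack is ranked first and the other third, so AP is (1/1 + 2/3) / 2 and AUC is 3/4.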
5 benign browser-based scenarios — 3 datasets