Machine Learning for monitoring: Detecting anomalies in the monitoring system

By Shin, Jongho at BECKS#5 https://becks.kktix.cc/events/twbecks5

LINE Developers Taiwan

December 12, 2019

Transcript

  1. Machine Learning for monitoring: Detecting anomalies in the monitoring system
     Shin, Jongho; LINE+ Graylab, 2019.12.12
  2. Contents
     01 Anomaly detection using ML
     02 ML system design
     03 Results and Future works
     04 Summary
  3. Anomaly? Anomalies are everywhere.
  4. Anomaly in monitoring?
     - Not just outliers, but malicious activities: fraud detection, network intrusion detection, etc.
     - Unknown unknowns: we don’t know what an anomaly will look like until we find it.
  5. Challenges
     - ML is not good at finding subtle differences. Moreover, deep learning is good at finding or generating similar things.
     - Hard to explain the result. The result may not be intuitive. Signal fatigue.
     - Adversarial ML. Poisoning attacks, evasion attacks.
     Source: Stanford CS259D lecture slides, https://web.stanford.edu/class/cs259d/
  6. How are others doing?
     Google Security Monitoring Tools Group (disclaimer: this diagram is old data!)
     Source: Stanford CS259D lecture slides, https://web.stanford.edu/class/cs259d/
  7. How are we doing?
     Finding anomalous access to critical assets
     - Multiple (heterogeneous) models for each user
     - Time-series prediction on overall traffic
     Diagram labels: Access log, Logstash, Elasticsearch, User-level detection, Overall detection, Alarm system
  8. Clustering
     - Grouping the elements based on similarity
     - Anomalous elements won’t be grouped with normal elements (hopefully)
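The idea on this slide can be sketched with a small density-based clustering pass in which points that join no group are treated as anomaly candidates. This is a minimal sketch using plain DBSCAN written by hand, not the HDBSCAN variant the deck turns to next. The feature pairs, `eps`, and `min_pts` are hypothetical values chosen only for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point; -1 marks noise."""

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical per-user features: (requests per hour, distinct assets accessed).
points = [(10, 2), (11, 2), (12, 3), (10, 3), (11, 3),
          (50, 8), (51, 9), (52, 8), (50, 9),
          (30, 40)]
labels = dbscan(points, eps=3.0, min_pts=3)
anomalies = [p for p, label in zip(points, labels) if label == -1]
print(labels, anomalies)
```

The two dense groups get their own labels, while the isolated point keeps the noise label `-1`, which is the "won’t be grouped with normal elements" behaviour the slide hopes for.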
  9. Clustering
     We work with the security community. Hackers are always welcome in LINE.
  10. Clustering: HDBSCAN
      - Hierarchical DBSCAN
      - Can handle groups of different density
  11. Clustering: HDBSCAN
      - Hierarchical DBSCAN
      - Can handle groups of different density
      Figure source: https://www.groundai.com/project/hierarchical-clustering-that-takes-advantage-of-both-density-peak-and-density-connectivity/
  12. Anomaly scoring
      There are methods for anomaly scoring other than clustering.
      https://scikit-learn.org/stable/auto_examples/plot_anomaly_comparison.html
  13. Anomaly scoring: Isolation Forest
      - Partition the space until the point can be isolated
      - Fewer partitions mean a more anomalous point
      Susto, Gian Antonio et al. “Anomaly Detection through on-line Isolation Forest: An application to plasma etching.”
  14. Anomaly scoring: Isolation Forest
      - Partition the space until the point can be isolated
      - Fewer partitions mean a more anomalous point
      Susto, Gian Antonio et al. “Anomaly Detection through on-line Isolation Forest: An application to plasma etching.”
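The partitioning idea can be illustrated with a small hand-written isolation forest. This is a minimal sketch for intuition, assuming synthetic 2-D data and hypothetical settings (`n_trees`, `sample_size`, the seeds). It is not the speaker's implementation and not a substitute for a library version.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on a random axis at a random value until points are isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    return {
        "dim": dim,
        "split": split,
        "left": build_tree([p for p in points if p[dim] < split], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p[dim] >= split], depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Number of partitions needed to reach the leaf that holds `point`."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1); shorter average paths give scores closer to 1."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))  # assumes at least 2 points
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2 ** (-sum(path_length(p, t) for t in trees) / (n_trees * norm))
            for p in points]


# Synthetic data: 40 points around the origin plus one far-away point.
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
points.append((8.0, 8.0))
scores = anomaly_scores(points)
most_anomalous = points[scores.index(max(scores))]
print(most_anomalous)
```

The far-away point is separated after very few random cuts, so its average path is short and its score is the highest in the set.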
  15. Anomaly scoring: Extended Isolation Forest
      - Random slicing
      Hariri, Sahand, Matias Carrasco Kind, and Robert J. Brunner. "Extended Isolation Forest."
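"Random slicing" means each cut is a hyperplane with a random orientation instead of an axis-parallel split. The sketch below follows the branching rule from the cited paper, (x - p) · n ≤ 0, with a random normal n and a random intercept p drawn inside the bounding box of the remaining points. It is a simplified illustration that counts cuts along a single path rather than building a full forest. The data, seeds, and helper names (`side`, `cuts_to_isolate`, `mean_cuts`) are assumptions made for this example.

```python
import random


def side(point, normal, intercept):
    """Signed position of `point` relative to the hyperplane (x - p) . n = 0."""
    return sum((x - p) * n for x, p, n in zip(point, intercept, normal))


def cuts_to_isolate(target, points, rng, max_cuts=50):
    """Apply random hyperplane cuts until `target` is alone on its side."""
    remaining = list(points)
    cuts = 0
    while len(remaining) > 1 and cuts < max_cuts:
        dims = range(len(target))
        # Random slope: normal vector with standard normal components.
        normal = [rng.gauss(0.0, 1.0) for _ in dims]
        # Random intercept inside the bounding box of the remaining points.
        intercept = [rng.uniform(min(p[d] for p in remaining),
                                 max(p[d] for p in remaining)) for d in dims]
        target_side = side(target, normal, intercept) <= 0
        remaining = [p for p in remaining
                     if (side(p, normal, intercept) <= 0) == target_side]
        cuts += 1
    return cuts


def mean_cuts(target, points, trials=200, seed=0):
    """Average number of cuts over many independent random slicings."""
    rng = random.Random(seed)
    return sum(cuts_to_isolate(target, points, rng) for _ in range(trials)) / trials


# Synthetic data: a cluster around the origin, one central point, one far point.
data_rng = random.Random(7)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
inlier, outlier = (0.0, 0.0), (8.0, 8.0)
points += [inlier, outlier]

inlier_cuts = mean_cuts(inlier, points)
outlier_cuts = mean_cuts(outlier, points)
print(inlier_cuts, outlier_cuts)
```

The far point needs fewer slices on average than the central one. A full Extended Isolation Forest would additionally subsample, cap the tree depth, and convert the average path length into a normalized score.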
  16. Time-series prediction
      Additive model
      - Trend, seasonal, and holiday
      - FB’s Prophet library
      $y(t) = g(t) + s(t) + h(t) + \varepsilon_t$
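To make the additive form concrete, the sketch below fits g(t), s(t), and h(t) by hand on synthetic daily traffic and flags days whose residual is unusually large. It does not use Prophet, which fits these components very differently. The weekly pattern, the holiday on day 25, the injected spike on day 40, and the threshold multiplier are all hypothetical.

```python
import statistics

WEEKLY = [0, 5, 8, 6, 3, -10, -12]  # hypothetical weekday effect
HOLIDAYS = {25}                     # hypothetical holiday (day index)
days = list(range(56))              # eight weeks of daily totals


def synthetic_traffic(t):
    trend = 100 + 0.5 * t
    seasonal = WEEKLY[t % 7]
    holiday = -20 if t in HOLIDAYS else 0
    noise = ((t * 37) % 11 - 5) * 0.2  # deterministic stand-in for the error term
    spike = 30 if t == 40 else 0       # injected anomaly
    return trend + seasonal + holiday + noise + spike


y = [synthetic_traffic(t) for t in days]

# g(t): ordinary least-squares line through the series.
t_mean = statistics.fmean(days)
y_mean = statistics.fmean(y)
slope = (sum((t - t_mean) * (v - y_mean) for t, v in zip(days, y))
         / sum((t - t_mean) ** 2 for t in days))


def g(t):
    return y_mean + slope * (t - t_mean)


# s(t): median detrended value per weekday (median resists the spike).
s_by_weekday = [
    statistics.median([y[t] - g(t) for t in days
                       if t % 7 == d and t not in HOLIDAYS])
    for d in range(7)
]


def s(t):
    return s_by_weekday[t % 7]


# h(t): average leftover on the known holiday days.
h_effect = statistics.fmean([y[t] - g(t) - s(t) for t in sorted(HOLIDAYS)])


def h(t):
    return h_effect if t in HOLIDAYS else 0.0


# Residual = observed minus g + s + h; flag large deviations (robust scale).
residuals = [y[t] - (g(t) + s(t) + h(t)) for t in days]
center = statistics.median(residuals)
mad = statistics.median([abs(r - center) for r in residuals])
threshold = 5 * 1.4826 * mad
anomalies = [t for t in days if abs(residuals[t] - center) > threshold]
print(anomalies)
```

With these synthetic numbers the injected spike on day 40 ends up in `anomalies`, while the holiday dip on day 25 is absorbed by h(t) and is not flagged.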
  17. Result
      Overall filtering
      - Reduced more than 90%
      - Rough filtering
      - FP >> FN
      - No FP/FN ratio analysis :(
  18. What’s next?
      - Deep learning
      - Hyperparameter optimization
      - Better result explanation
      - Adversarial attacks
  19. Summary
      In a nutshell,
      - Anomaly detection is difficult even with ML
      - Still, it’s better than manual detection
      - There’s no silver bullet solution
      - Open subject
      - Using more models is more robust, but it is hard to harmonize them
  20. References
      - https://web.stanford.edu/class/cs259d/
      - Campello, Ricardo JGB, Davoud Moulavi, and Jörg Sander. "Density-based clustering based on hierarchical density estimates." Pacific-Asia conference on knowledge discovery and data mining. Springer, Berlin, Heidelberg, 2013.
      - Hariri, Sahand, Matias Carrasco Kind, and Robert J. Brunner. "Extended Isolation Forest." arXiv preprint arXiv:1811.02141 (2018).
      - Taylor, Sean J., and Benjamin Letham. "Forecasting at scale." The American Statistician 72.1 (2018): 37-45.
  21. Thank you