A Formal Framework for Program Anomaly Detection

Xiaokui Shu
November 03, 2015

The slides for my RAID '15 paper (same title). Recording: https://youtu.be/1SmdyImOvY4



  1. A Formal Framework for Program Anomaly Detection. Xiaokui Shu, Danfeng (Daphne) Yao, and Barbara G. Ryder. Department of Computer Science, Virginia Tech, Blacksburg, Virginia.
  2. Content: Intrusion Detection for Programs; Program Anomaly Detection; Our Unification Framework.
  3. Signature-based intrusion detection: the standard defense against program attacks.
    Zero-day exploits: vulnerabilities not known or fixed prior to the attack.
    Specific attack signature: CA-2001-26 (IE/IIS vulnerability used by Nimda Worm): GET /scripts/root.exe; GET /scripts/..\xc1\x1c../winnt/system32/cmd.exe; GET /scripts/..%35c../winnt/system32/cmd.exe
    Behavior signature: a class of JS attacks [Karanth et al. MSR 2010]: unescape(), replace(), new_array()
    Stuxnet: CVE-2010-2568 (Windows), CVE-2010-2729 (Windows), CVE-2010-2772 (Siemens)
    EMC RSA Attack: CVE-2011-0609 (Flash)
    Operation Aurora: CVE-2012-0779 (Flash), CVE-2012-1875 (IE), CVE-2012-1889 (MS XML), CVE-2012-1535 (Flash)
  4. Program Anomaly Detection (a.k.a. host-based anomaly detection [Denning 1987])

  5. [Figure: a system call trace over time: … sys_ioctl() sys_open() sys_read() sys_setpgid() sys_setsid() sys_fork() …]
    n-gram [Forrest 1996]
    FSA [Sekar 2001, Wagner 2001]
    PDA [Feng 2003, Feng 2004, Giffin 2004]
    Data analysis [Giffin 2006, Bhatkar 2006], with the example x = 1, y = x+1, y = x*2, w = x*y
    Machine learning [Lee 1998, Mutz 2006, Xu 2015] [Figure: state sequences X0 … Xj+1 and Y0 … Yj+1]
    Static program analysis; dynamic program analysis
    Hybrid detection [Gao 2004, Liu 2005]
    Also cited: [Wagner 2002] [Sharif 2007] [Forrest 2008] [Feng 2004] [Chandola 2009]
  6. A Uniform Understanding of Program Anomaly Detection Approaches: A Field Map for Program Anomaly Detection
    [Figure: field map built from the system call trace (… sys_ioctl() sys_open() sys_read() sys_setpgid() sys_setsid() sys_fork() …) and the state sequences X0 … Xj+1 and Y0 … Yj+1; legend: existing method, potential method]
  7. What Does a Detection Method See? To decide whether the program is running normally.
    [Figure labels: operating system kernel; process; system calls; library calls; function calls; black-box approaches; white-box approaches]
  8. Projections of Process Observations (used by program anomaly detection methods)
    Instruction trace: … push mov sub call mov sysenter ret …
    Function call/ret trace: … call call call ret ret ret call ret …
    System call trace: … sysenter sysenter sysenter sysenter …
    Precise program trace; practical program trace
  9. Program Anomaly Detection: Definitions
    Program anomaly detection is a decision problem: whether a program trace belongs to (∈) the set of all normal program traces.
    A program anomaly detection approach is a formal language (deterministic or stochastic): the program trace is a string, the set of all normal program traces is the language, and the decision is whether the string is accepted by the language.
    Precision: What level is the projection? How descriptive is the grammar? How hard is it to cheat the detection?
    Scope of the norm: Which practical traces at the projection level are selected as practical normal traces?
  10. Examples of PAD Abstractions
    n-gram approach: trace … b g g b b …; 3-grams: bgg, ggb, gbb; two rules: ggb can follow bgg, and gbb can follow ggb. Model: finite state automaton (FSA) with states bgg, ggb, gbb; accepted language: regular language.
    PDA approach: int sum(int n) { if (n == 0) { s1(); s2(); return n; } else return n + sum(n - 1); } [Figure: call stack main, sum, sum, …, sum]. Model: pushdown automaton (PDA); accepted language: context-free language.
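
The n-gram example on this slide can be sketched in a few lines. This is a toy model under stated assumptions, not the implementation from [Forrest 1996]: traces are strings of single-character events, and `ngrams`, `train`, and `is_normal` are hypothetical helper names.

```python
def ngrams(trace, n):
    """List the overlapping n-grams of a trace (a string of events), in order."""
    return [trace[i:i + n] for i in range(len(trace) - n + 1)]


def train(traces, n):
    """Collect the n-gram states and the observed 'can follow' rules."""
    states, rules = set(), set()
    for trace in traces:
        grams = ngrams(trace, n)
        states.update(grams)
        rules.update(zip(grams, grams[1:]))  # (previous n-gram, next n-gram)
    return states, rules


def is_normal(trace, states, rules, n):
    """Accept a trace if every n-gram and every succession was seen in training.

    Traces shorter than n contain no n-grams and are accepted vacuously.
    """
    grams = ngrams(trace, n)
    return (all(g in states for g in grams)
            and all(pair in rules for pair in zip(grams, grams[1:])))


# Train on the slide's trace: states bgg, ggb, gbb and two succession rules.
states, rules = train(["bggbb"], 3)

print(ngrams("bggbb", 3))                   # ['bgg', 'ggb', 'gbb']
print(is_normal("bggbb", states, rules, 3))  # True
print(is_normal("bgbb", states, rules, 3))   # False: 'bgb' was never observed
```

The learned states and rules form a finite state automaton, which is why the slide places the n-gram approach at the regular language level.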
  11. Detection Property: Precision
    Precision (detection capability): the number of precise program traces that share the same practical program trace.
    Example: the normal precise trace … a b c g e g a b b c f … and the anomalous precise trace … m b n g s g m b b n t … both project to the practical trace … b g g b b …. This real-world PAD lacks the precision to detect the anomalous execution.
    More descriptive grammar -> higher precision. Context-free languages are more descriptive than regular languages. Pushdown automaton approaches can describe practical program traces better than n-gram methods, so PDA approaches can give more accurate detection than n-gram approaches.
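
The claim that a more descriptive grammar gives higher precision can be illustrated with a toy comparison. This is a hedged sketch, not a construction from the paper: the call/ret traces are invented, the 2-gram model stands in for a regular-language detector, and `stack_accepts` is a simple depth counter, which is a restricted form of pushdown automaton.

```python
def bigrams(trace):
    """Return the set of adjacent event pairs (2-grams) in a trace."""
    return set(zip(trace, trace[1:]))


def ngram_accepts(trace, normal_bigrams):
    """Regular-language style check: every 2-gram must have been seen in training."""
    return bigrams(trace) <= normal_bigrams


def stack_accepts(trace):
    """Pushdown-style check: every ret must match an earlier unmatched call."""
    depth = 0
    for event in trace:
        if event == "call":
            depth += 1
        elif event == "ret":
            if depth == 0:
                return False  # return without a matching call
            depth -= 1
        else:
            return False  # unknown event
    return depth == 0


# Hypothetical normal traces: balanced call/ret sequences.
training = [
    ["call", "call", "ret", "ret"],
    ["call", "call", "call", "ret", "ret", "ret"],
]
normal_bigrams = set().union(*(bigrams(t) for t in training))

# Hypothetical anomalous trace: one more ret than call.
anomalous = ["call", "call", "ret", "ret", "ret"]

print(ngram_accepts(anomalous, normal_bigrams))  # True: the 2-gram model misses it
print(stack_accepts(anomalous))                  # False: the stack check catches it
```

The 2-gram model cannot count, so it accepts any mix of the pairs it has seen, while the stack-based check tracks nesting depth and rejects the unbalanced trace.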
  12. Our Unification Framework for PAD
    L-1: context-sensitive language level
    L-2: context-free language level
    L-3: regular language level
    L-4: restricted regular language level
    [Figure: levels L-1 to L-4 and the theoretical accuracy limit; labels: path sensitivity; flow sensitivity; Co-oc [Shu CCS 2015] (Bach language); VPStatic [Feng 2004], VtPath [Feng 2003], Dyck [Giffin 2004], DFAD [Bhatkar 2006]; statically built FSA [Wagner 2001]; ESD [Giffin 2006]; n-gram [Forrest 1996], dynamic DFA [Sekar 2001]; individual event analysis]
  13. Detection Property: Scope of The Norm
    [Figure: a probabilistic FSA over the events g and b with states 0, 1, 2, 3 and transition probabilities 0.1, 0.2, 0.7, shown with two deterministic FSAs (FSA 1, FSA 2) and example traces of g and b events; scale from "one trace" to "all regular language traces"]
    Same precision, different decision on anomaly detection.
    The scope of the norm can be defined deterministically or probabilistically.
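
A probabilistic scope of the norm can be sketched as a probability threshold over one automaton. Everything concrete here is an assumption for illustration: the transition structure and probabilities in `PFSA` are invented (the slide's exact automaton is not recoverable from the transcript), and `trace_probability` and `is_normal` are hypothetical helpers.

```python
# Hypothetical probabilistic FSA: state -> {event: (next_state, probability)}.
PFSA = {
    0: {"g": (1, 0.7), "b": (2, 0.3)},
    1: {"g": (1, 0.2), "b": (2, 0.8)},
    2: {"b": (2, 1.0)},
}


def trace_probability(trace, start=0):
    """Multiply transition probabilities along the trace; 0.0 if a step is missing."""
    state, prob = start, 1.0
    for event in trace:
        if event not in PFSA[state]:
            return 0.0
        state, step_prob = PFSA[state][event]
        prob *= step_prob
    return prob


def is_normal(trace, threshold):
    """A trace is in the scope of the norm if its probability exceeds the threshold."""
    return trace_probability(trace) > threshold


# Same automaton, two thresholds, two different decisions for the same trace.
print(is_normal("ggbb", 0.0))  # True: any trace the automaton can produce is normal
print(is_normal("ggbb", 0.5))  # False: only high-probability traces are normal
print(is_normal("bg", 0.0))    # False: state 2 has no transition on g
```

With a threshold of 0 the detector behaves like a deterministic FSA that accepts every trace the automaton can produce; raising the threshold narrows the scope of the norm without changing the automaton itself.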
  14. Future Directions and Open Issues
    [Figure: levels L-1 to L-4 and the theoretical accuracy limit; labels: path sensitivity; flow sensitivity; Co-oc; VPStatic, VtPath, Dyck, DFAD; statically built FSA; ESD; n-gram, dynamic DFA; individual event analysis; precision; practicality]
    Lightweight user-space tracing
    Training data purification
  15. Conclusion
    A PAD approach is a formal language.
    Uniform framework to understand PAD precision.
    Theoretical accuracy limit proved.
    Future directions discussed.
    This work has been supported by grant ONR N00014-13-1-0016.
  16. Thank you! Xiaokui is seeking a researcher/postdoc position. subx@cs.vt.edu http://xshu.net