
Group meeting: TaintPipe - Pipelined Symbolic Taint Analysis

Yu-Hsin Hung
November 04, 2016

Paper (USENIX Security '15): https://www.usenix.org/node/190954



Transcript

  1. TaintPipe: Pipelined Symbolic Taint Analysis
    Jiang Ming, Dinghao Wu, Gaoyao Xiao, Jun Wang, and Peng Liu
    The Pennsylvania State University
    USENIX Security '15
  2. Outline
    • Introduction
    • Background
    • Design and Implementation
    • Experimental Evaluation
    • Discussions and Limitations
  3. Taint analysis
    • Basic idea: keep track of tags derived from user input
    • Taint seeds, taint propagation, taint sinks
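The seed, propagation, and sink steps on this slide can be illustrated with a toy tracker. This is a minimal sketch, not TaintPipe's implementation: the `TaintTracker` class, the variable names, and the tag name `tag1` are all hypothetical, and real tools track taint per byte of registers and memory rather than per named variable.

```python
from dataclasses import dataclass, field


@dataclass
class TaintTracker:
    """Toy shadow state: maps a variable name to the set of taint tags it carries."""
    shadow: dict = field(default_factory=dict)

    def seed(self, var, tag):
        # Taint seed: data arriving from user input receives a tag.
        self.shadow[var] = {tag}

    def assign(self, dst, *srcs):
        # Taint propagation: dst carries the union of the tags of its sources.
        tags = set()
        for src in srcs:
            tags |= self.shadow.get(src, set())
        if tags:
            self.shadow[dst] = tags
        else:
            # dst was overwritten with untainted data, so drop any old tags.
            self.shadow.pop(dst, None)

    def check_sink(self, var):
        # Taint sink: report which tags reach a sensitive use of var.
        return self.shadow.get(var, set())


tracker = TaintTracker()
tracker.seed("buf", "tag1")                  # buf = read(user input)
tracker.assign("length", "buf")              # length derived from buf
tracker.assign("offset", "length", "base")   # offset = base + length
tracker.assign("counter")                    # counter = constant
```

Here `tracker.check_sink("offset")` reports `tag1`, because `offset` was computed from tainted data, while `counter` carries no tags.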
  4. Taint analysis
    • Static taint analysis (STA)
        • performed prior to execution
        • no impact on runtime performance
        • under-tainting or over-tainting
    • Dynamic taint analysis (DTA)
        • more accurate
        • high runtime overhead (Pin: > 6X slowdown)
  5. Problems of DTA
    • high runtime overhead (6X~30X)
        • 6~8 extra instructions to propagate a taint tag in shadow memory
    • strict coupling of program execution and data flow tracking logic
        • frequent “context-switches”
        • register spilling
        • data cache pollution
  6. Previous work
    • Hardware-assisted approach
        • customized hardware for logging a program trace and delivering it to other idle cores for inspection
        • hardware first-in first-out buffer for speeding up communication between cores
    • Software-only methods
        • rely on dynamic binary instrumentation (DBI)
        • decouple dynamic taint analysis from program execution
        • ShadowReplica: “primary & secondary” thread model
  7. TaintPipe
    • parallel data flow tracking using pipelined symbolic taint analysis
        • segmented symbolic taint analysis
        • symbolic taint state resolution
  8. Inlined Analysis vs. TaintPipe
    • instrumented application thread: lightweight online logging
    • multiple worker threads: symbolic taint analysis
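The split between a logging thread and analysis workers might be sketched as follows. This is an illustration under stated assumptions, not the tool's code: `BUFFER_SIZE`, `analyze_segment`, and `run_pipeline` are hypothetical names, the "analysis" is a placeholder that only counts blocks, and Python threads stand in for the native worker threads of the real system.

```python
from concurrent.futures import ThreadPoolExecutor

BUFFER_SIZE = 4  # hypothetical: basic-block entries per profile buffer


def analyze_segment(segment_id, blocks):
    # Placeholder for a worker's symbolic taint analysis of one segment.
    return segment_id, len(blocks)


def run_pipeline(executed_blocks, num_workers=2):
    futures = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        buffer = []
        for block in executed_blocks:        # application thread: logging only
            buffer.append(block)
            if len(buffer) == BUFFER_SIZE:   # a full buffer becomes one segment
                futures.append(pool.submit(analyze_segment, len(futures), buffer))
                buffer = []
        if buffer:                           # flush the final partial buffer
            futures.append(pool.submit(analyze_segment, len(futures), buffer))
    # Collect results in segment order, which a later resolution step relies on.
    return [f.result() for f in futures]


results = run_pipeline(list(range(10)))
```

The application thread never waits for an analysis result inside the loop; it only appends to a buffer and hands full buffers off, which is the point of decoupling logging from taint tracking.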
  9. Inlined Analysis vs. TaintPipe
    • (a) code segment
    • (b) symbolic taint states, the input value size and num are labeled as symbol1 and symbol2
    • (c) resolving symbolic taint states when size is tainted as tag1 and num is a constant value (num = 0xffffffff)
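Steps (b) and (c) can be approximated with a small substitution routine. This is a simplified sketch: the register names and the dependency sets in `symbolic_state` are made up for illustration, and the sketch only tracks which input symbols a location depends on. It does not model symbolic formulas, so it cannot reproduce simplifications that depend on the concrete value of a constant such as num = 0xffffffff.

```python
def resolve(symbolic_state, incoming):
    """Replace each input symbol with the concrete tags known at segment entry."""
    resolved = {}
    for location, symbols in symbolic_state.items():
        tags = set()
        for symbol in symbols:
            tags |= incoming.get(symbol, set())
        resolved[location] = tags
    return resolved


# Hypothetical result of analyzing one segment symbolically:
# each location lists the input symbols it was derived from.
symbolic_state = {
    "eax": {"symbol1"},              # derived from size only
    "ebx": {"symbol1", "symbol2"},   # derived from size and num
    "ecx": {"symbol2"},              # derived from num only
}

# Concrete state supplied by the preceding segment:
# size is tainted with tag1, num is a constant and therefore untainted.
incoming = {"symbol1": {"tag1"}, "symbol2": set()}

final_state = resolve(symbolic_state, incoming)
```

After resolution, locations that depend on `symbol1` carry `tag1`, and the location that depends only on `symbol2` is untainted.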
  10. TaintPipe
    • record compact control flow information to reconstruct straight-line code
        • targets of direct and indirect jumps have been resolved
        • most addresses of memory operations can be inferred from the straight-line code
    • pipelining design (asynchronous)
        • may detect an attack some time after the real attack has happened
  11. Architecture
    • built on top of a dynamic binary instrumentation tool
    • works with unmodified program binaries
  12. Implementation
    • online logging and pipelining framework
        • dynamic binary instrumentation framework Pin
        • 3,100 lines of C/C++ code
    • taint analysis engine
        • binary analysis platform BAP
        • based on BAP’s symbolic execution module
        • 4,400 lines of OCaml code
  13. Logging
    • Logged data
        • control flow profile (in compact format)
        • concrete execution state when taint seeds are first introduced, including registers and memory (e.g., CR0~CR4, EFLAGS and addresses of initial taint seeds)
  14. Lightweight Online Logging
    • control flow profile: sequence of basic blocks executed
    • Detailed Execution Profile (DEP)
    • 2-byte profile structure to represent 4-byte basic block address on x86-32 machine
    • H-tag, L-tag, and special tag (0x0000)
    • optimization for REP-prefix instructions
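One way a 2-byte profile entry could stand in for a 4-byte basic block address is sketched below. The slide names the H-tag, L-tag, and special tag but not their exact layout, so the encoding here is an assumption: the low 16 bits (L-tag) are logged for every block, and the high 16 bits (H-tag) are logged only when they change, announced by the special tag 0x0000. The function names and the sample trace are hypothetical.

```python
ESCAPE = 0x0000  # special tag: the next two words are an H-tag and an L-tag


def encode(addresses):
    """Encode 32-bit block addresses as a list of 16-bit words (assumed layout)."""
    words, current_high = [], None
    for addr in addresses:
        high, low = addr >> 16, addr & 0xFFFF
        # Emit the long form when the high half changes, or when the low half
        # would otherwise be mistaken for the special tag.
        if high != current_high or low == ESCAPE:
            words += [ESCAPE, high, low]
            current_high = high
        else:
            words.append(low)
    return words


def decode(words):
    """Rebuild the 32-bit addresses from the 16-bit word stream."""
    addresses, current_high, i = [], None, 0
    while i < len(words):
        if words[i] == ESCAPE:
            current_high, low = words[i + 1], words[i + 2]
            i += 3
        else:
            low = words[i]
            i += 1
        addresses.append((current_high << 16) | low)
    return addresses


trace = [0x08048000, 0x08048010, 0x08048024, 0xB7E10000, 0xB7E10040]
words = encode(trace)
```

The saving comes from code locality: consecutive basic blocks usually share their high 16 bits, so most entries cost one 2-byte word instead of four bytes.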
  15. Optimal configuration
    • Determine the optimal values for two factors
        • control flow profile buffer size
        • number of worker threads
    • Experiment setup
        • 2x Intel Xeon E5-2690
        • 128GB RAM
        • 250GB SSD
        • Ubuntu 12.04
  16. Performance (SPEC CPU2006)
    On average, the instrumented application thread imposes a 2.60X slowdown over native execution, while the overall slowdown of TaintPipe is 4.14X.
  17. Effects of Optimizations
    • O1: function summary
    • O2: O1 + taint basic block cache
    • O3: O2 + intra-block optimization
  18. Discussions and Limitations
    • asynchronous taint check
        • provide synchronous policy enforcement at critical points
    • malicious self-modifying code