Static Taint Analysis for JavaScript Programs

Exactpro
November 07, 2019

Nabil Almashfi and Lunjin Lu

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/bouTxh76yeU

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/

Transcript

  1. Static Taint Analysis for JavaScript Programs / 25
    Static Taint Analysis for
    JavaScript Programs
    Nabil Almashfi & Lunjin Lu


  2. Static Taint Analysis for JavaScript Programs / 25
    Outline
    ■ Background
    ■ Related Work
    ■ Motivation
    ■ Approach
    ■ Evaluation
    ■ Conclusion

  3. Static Taint Analysis for JavaScript Programs / 25
    Background - JavaScript

    JavaScript is primarily a web scripting
    language.

    JavaScript code is embedded in an HTML page
    and is executed on the client side.

    JavaScript is dynamic and prototype-based.

    Object properties can be added and deleted
    on the fly.
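
A minimal sketch of what "dynamic and prototype-based" means in practice. The object and property names are hypothetical and chosen only for illustration. Property names computed at run time, as in the third step, are what make property accesses hard to resolve statically.

```javascript
// Hypothetical object, used only to illustrate dynamic properties.
const user = { name: "alice" };

// Properties can be added after the object is created.
user.role = "admin";
const hadRole = "role" in user;

// Property names can be computed at run time.
const key = "na" + "me";
const viaComputedKey = user[key];

// Properties can be deleted again.
delete user.role;
const hasRole = "role" in user;

// Prototype-based inheritance: guest delegates lookups to user.
const guest = Object.create(user);
const inheritedName = guest.name;
```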

  4. Static Taint Analysis for JavaScript Programs / 25
    Background - JavaScript Security

    Web applications are moving towards client-side
    functionality and storage.

    Exploitable JavaScript code exposes the user
    and the system to significant damage.

    DOM-based Cross-site Scripting (XSS) is one
    of the top JavaScript vulnerabilities.

    DOM-based XSS occurs purely on the
    client side.

  5. Static Taint Analysis for JavaScript Programs / 25
    DOM-based Cross-site Scripting (XSS)


    DOM-based XSS!

    document.write(document.URL);

    Attack:
    www.site.com/page.html?default=alert(document.cookie)
    ■ Such an attack can cause serious damage, such as stealing user
    information, or simply break how the application displays data.
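
The source-to-sink flow on this slide can be simulated outside a browser. This is only a sketch: `makeFakeDocument` is a hypothetical stand-in for the browser's `document` object, and the attack URL with a script payload is an assumed example, not the exact URL from the talk.

```javascript
// Hypothetical stand-in for the browser's document object,
// so the flow can be run outside a browser.
function makeFakeDocument(url) {
  const written = [];
  return {
    URL: url, // source: controlled by whoever crafts the link
    write(markup) { written.push(markup); }, // sink: parsed as HTML in a real browser
    get output() { return written.join(""); },
  };
}

// Assumed attack URL carrying a script payload in the query string.
const attackUrl =
  "http://www.site.com/page.html?default=<script>alert(document.cookie)</script>";

const fakeDocument = makeFakeDocument(attackUrl);
fakeDocument.write(fakeDocument.URL); // same source-to-sink flow as on the slide

// The payload reaches the sink without any encoding applied.
const reachesSinkUnencoded = fakeDocument.output.includes("<script>");
```

A taint analysis flags this flow because the value read from the source reaches the sink without passing through any sanitizer.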


  6. Static Taint Analysis for JavaScript Programs / 25
    Related Work - ACTARUS

    ACTARUS is a static taint analysis for
    JavaScript [1].

    Builds a static representation of the program,
    consisting of a call graph.

    Traverses the call graph to find sources, sinks,
    and sanitizers in the code and partitions them
    into rules.

    These flows are potential security issues.

  7. Static Taint Analysis for JavaScript Programs / 25
    Related Work - Gatekeeper

    A mostly static approach to enforce security
    and reliability policies in JavaScript
    programs.

    The policies are expressed in the form of
    succinct declarative Datalog queries.

    Programs are represented as a database of
    Datalog rules against which Gatekeeper
    policies are checked.

    The policies include preventing code injection
    attempts and detecting cross-site scripting.

    Employs points-to analysis for program
    understanding, and uses the analysis to detect

  8. Static Taint Analysis for JavaScript Programs / 25
    Related Work - Blended Taint Analysis

    Combines static and dynamic analysis [3].

    In the Dynamic Phase, information that
    is not statically known is collected.

    In the Static Phase, a static taint analysis is
    run on the program.

    Solutions from both phases are combined
    into a single solution.

  9. Static Taint Analysis for JavaScript Programs / 25
    Related Work - Summary

    None of the previous works differentiates between
    different HTML rendering contexts.

    They all use a constant propagation domain
    to track strings.

    Some of them combine dynamic analysis
    with static analysis, whereas our work is
    purely based on static analysis.

  10. Static Taint Analysis for JavaScript Programs / 25
    Rendering Contexts

    Common contexts in a Web page:
    - JavaScript Context (JSC)
    - HTML Element Context (HEC)
    - HTML Attribute Context (HAC)
    - HTML Event-handler Attribute Context (HHC)
    - HTML URL Attribute Context (HUC)
    - CSS Attribute Context (HCC)

  11. Static Taint Analysis for JavaScript Programs / 25
    Rendering Contexts

    Each context requires specific
    encoding rules, as defined by OWASP [4].

    Three encoding mechanisms:
    - JavaScript encoding (JE)
    - HTML encoding (HE)
    - URL encoding (UE)
    ■ The primary recommendation is
    to avoid placing user
    input in JSC and HHC contexts.
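
The three encoding mechanisms listed above can be sketched as follows. The function names `htmlEncode`, `jsEncode`, and `urlEncode` are hypothetical, and the implementations are deliberately simplified. They are not the encoders used in the talk, and real applications should rely on a maintained encoding library.

```javascript
// HTML encoding (HE): replace characters that are significant in HTML.
function htmlEncode(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
  return String(s).replace(/[&<>"']/g, (c) => map[c]);
}

// JavaScript encoding (JE): escape every non-alphanumeric UTF-16 code unit as \uXXXX.
function jsEncode(s) {
  return String(s).replace(
    /[^A-Za-z0-9]/g,
    (c) => "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// URL encoding (UE): percent-encode a value for use inside a URL component.
function urlEncode(s) {
  return encodeURIComponent(String(s));
}
```

Each encoder neutralizes a different set of characters, which is why an encoder that is safe in one rendering context can be unsafe in another.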

  12. Static Taint Analysis for JavaScript Programs / 25
    Motivation
    ■ The function encodeURI() does not encode the
    single quotation mark.
    DOM-based XSS!

    document.write("<a id='" + encodeURI(DATA) + "'>Click Here</a>");
    var d2 = document.getElementById("d2");
    d2.innerHTML = encodeForJS(encodeForHTML(DATA));

    Attack: 9' href='http://www.google.com
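
A short check of the claim about encodeURI(), using the attack string from the slide. The variable names are hypothetical. The sketch only shows that the single quotation marks survive encoding. It does not model how a browser would parse the resulting markup.

```javascript
// Attack string from the slide.
const DATA = "9' href='http://www.google.com";

// encodeURI() percent-encodes the space but leaves the single quotes intact.
const encoded = encodeURI(DATA);

// Same string concatenation as in the slide's document.write() call.
const markup = "<a id='" + encoded + "'>Click Here</a>";

// Count the single quotes that end up in the generated markup:
// two from the template plus two that came through from DATA.
const quoteCount = markup.split("'").length - 1;
```

Because the quotes from DATA are not encoded, they can close the id attribute early, which is the root of the vulnerability in an HTML attribute context.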


  13. Static Taint Analysis for JavaScript Programs / 25
    Background: TAJS
    ■ TAJS is a context-sensitive analyzer for JavaScript that
    supports the ECMAScript language and parts of the
    DOM and infers type information.
    ■ Constructs flow graphs to represent JavaScript program
    code and performs data flow analysis.
    ■ The analysis is based on the monotone framework
    using a lattice structure.
    ■ The lattice is based on constant propagation for all the
    primitive types of JavaScript values.

  14. Static Taint Analysis for JavaScript Programs / 25
    Approach: Taint Analysis

    TAJS_taint defines a set of rules; each rule is of
    the form $(S_1, S_2, S_3)$, where

    $S_1 \in \mathit{Sources}$, $S_2 \in \wp(\mathit{Sanitizers})$, and $S_3 \in \mathit{Sinks}$.

    Sources are either functions or object
    properties that can be controlled by the
    attacker.

    Sinks are functions or object properties where
    data can be executed.

    Sanitizers are functions that transform data as

  15. Static Taint Analysis for JavaScript Programs / 25
    Abstract Domain for Taint
    ■ Example: rules are defined in TAJS_taint as follows.
    ■ $S_1$ = { document.URL, location.href }
    ■ $S_2$ = { {encodeForJS(), encodeForHTML()} }
    ■ $S_3$ = { document.write() }
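
One way to picture such a rule is as plain data plus a check over observed flows. This is a sketch under stated assumptions, not the tool's implementation: the object layout, the `isViolation` helper, and the matching semantics (a flow is safe if it applied every sanitizer of at least one set in S2) are all hypothetical.

```javascript
// Hypothetical encoding of the example rule (S1, S2, S3) as plain data.
const rule = {
  sources: ["document.URL", "location.href"],
  sanitizers: [["encodeForJS", "encodeForHTML"]],
  sinks: ["document.write"],
};

// Assumed semantics: a flow from a rule source to a rule sink is a violation
// unless it applied every sanitizer of at least one set in S2.
function isViolation(rule, flow) {
  const relevant =
    rule.sources.includes(flow.source) && rule.sinks.includes(flow.sink);
  if (!relevant) return false;
  const applied = new Set(flow.applied);
  const sanitized = rule.sanitizers.some((set) =>
    set.every((name) => applied.has(name))
  );
  return !sanitized;
}

// Two hypothetical flows: one sanitized with the wrong function, one properly.
const unsafeFlow = { source: "document.URL", applied: ["encodeURI"], sink: "document.write" };
const safeFlow = {
  source: "document.URL",
  applied: ["encodeForHTML", "encodeForJS"],
  sink: "document.write",
};
```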

  16. Static Taint Analysis for JavaScript Programs / 25
    Abstract Domain for Taint
    ■ Abstract domain to determine the kinds of encodings a
    tainted value has gone through.
    ■ T = {HTML-encoded, JavaScript-encoded, URL-encoded}.
    ■ The abstract domain for taint analysis is ℘(T) ordered
    by ⊇.
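
Because the domain is ordered by ⊇, a larger set of encodings is a smaller (more precise) element, and the least upper bound of two elements is their intersection. The sketch below illustrates this with JavaScript sets. The helper names are hypothetical, and using the join at control-flow merge points is an assumption about how such a domain is typically used, not a detail stated in the talk.

```javascript
const T = ["HTML-encoded", "JavaScript-encoded", "URL-encoded"];

// x is below y in the lattice iff x is a superset of y.
function leq(x, y) {
  return [...y].every((t) => x.has(t));
}

// Least upper bound under the superset order: set intersection.
function join(x, y) {
  return new Set([...x].filter((t) => y.has(t)));
}

const bottomElement = new Set(T); // all encodings known to be applied
const topElement = new Set();     // nothing known about the value

// A value HTML- and URL-encoded on one branch, HTML- and JS-encoded on the other.
const branchA = new Set(["HTML-encoded", "URL-encoded"]);
const branchB = new Set(["HTML-encoded", "JavaScript-encoded"]);

// After the merge, only the encoding applied on both branches is guaranteed.
const merged = join(branchA, branchB);
```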

  17. Static Taint Analysis for JavaScript Programs / 25
    Abstract Domain for Taint
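    $T = \{ \text{HTML-encoded}, \text{JavaScript-encoded}, \text{URL-encoded} \}$
    Let $\mathit{String}$ be the set of all possible strings.
    $\gamma_T(\text{HTML-encoded}) = \{ HE(s) \mid s \in \mathit{String} \}$
    $\gamma_T(\text{JavaScript-encoded}) = \{ JE(s) \mid s \in \mathit{String} \}$
    $\gamma_T(\text{URL-encoded}) = \{ UE(s) \mid s \in \mathit{String} \}$
    $\gamma : \wp(T) \to \wp(\mathit{String})$ is defined as:
    $\gamma(X) = \bigcap \{ \gamma_T(t) \mid t \in X \}$
    $\alpha : \wp(\mathit{String}) \to \wp(T)$ is defined as:
    $\alpha(S) = \{ t \mid t \in T \land S \subseteq \gamma_T(t) \}$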

  18. Static Taint Analysis for JavaScript Programs / 25
    String Abstract Domain (1)
    ■ TAJS_taint associates every string variable with a deterministic
    finite automaton (DFA).
    ■ TAJS uses a constant propagation domain to track strings.
    ■ A DFA is more precise when dealing with dynamic property
    access.

  19. Static Taint Analysis for JavaScript Programs / 25
    String Abstract Domain (2)
    ■ In TAJS, p may point to any property of the object obj.
    ■ In TAJS_taint, p is represented by $g^n 1$.
    function lookup(obj, str, p) {
      // Prepend str to p until p reaches length N.
      while (p.length < N)
        p = str + p;
      return obj[p];
    }
    lookup(obj, "g", "1");
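
The language $g^n 1$ (zero or more g characters followed by a single 1) is regular, so a small DFA can describe every value p may take in the loop above. The sketch below is an illustration under assumptions, not the tool's data structure: the DFA layout, the `accepts` helper, and the sample object are hypothetical. Missing transitions are treated as an implicit rejecting state.

```javascript
// DFA for the language g^n 1 with n >= 0.
const dfa = {
  start: "q0",
  accepting: new Set(["q1"]),
  transitions: {
    q0: { g: "q0", "1": "q1" }, // loop on g, move to q1 on the final 1
    q1: {},                     // nothing may follow the 1
  },
};

// Run the DFA over the input; a missing transition rejects the string.
function accepts(dfa, input) {
  let state = dfa.start;
  for (const ch of input) {
    const next = dfa.transitions[state][ch];
    if (next === undefined) return false;
    state = next;
  }
  return dfa.accepting.has(state);
}

// Hypothetical object: only properties whose names the DFA accepts can be read via obj[p].
const sampleObject = { g1: "a", gg1: "b", other: "c", "1": "d" };
const mayBeRead = Object.keys(sampleObject).filter((name) => accepts(dfa, name));
```

With a constant propagation domain the value of p would be unknown after the loop, so every property of the object would have to be considered. The DFA narrows the candidates to the matching names.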


  20. Static Taint Analysis for JavaScript Programs / 25
    Evaluation
    ■ Benchmarks are real Web sites chosen from Alexa [5].
    ■ Most true positives are caused by using the function
    encodeURI() in the wrong context.

  21. Static Taint Analysis for JavaScript Programs / 25
    Evaluation - Execution time

  22. Static Taint Analysis for JavaScript Programs / 25
    Conclusion
    ■ We introduced a new approach to analyzing JavaScript
    programs to detect DOM-based XSS
    vulnerabilities.
    ■ The approach is based on identifying the various rendering
    contexts and ensuring that data has been properly
    sanitized for each context.
    ■ TAJS_taint uses DFAs as a more precise string domain.
    ■ Results show that TAJS_taint detects DOM-based XSS
    vulnerabilities more precisely.

  23. Static Taint Analysis for JavaScript Programs / 25
    Thank you!