Slide 1

Slide 1 text

Static Taint Analysis for JavaScript Programs
Nabil Almashfi & Lunjin Lu

Slide 2

Slide 2 text

Outline
■ Background
■ Related Work
■ Motivation
■ Approach
■ Evaluation
■ Conclusion

Slide 3

Slide 3 text

Background - JavaScript
■ JavaScript is primarily a web scripting language.
■ JavaScript code is embedded in an HTML page and executed on the client side.
■ JavaScript is dynamic and prototype-based.
■ Object properties can be added and deleted on the fly.
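
The last two bullets can be illustrated with a minimal sketch. The objects `base` and `page` are made-up names used only for this example:

```javascript
// Hypothetical objects, used only to illustrate dynamic, prototype-based behavior.
const base = { greet() { return "hello from " + this.name; } };

// Object.create makes `base` the prototype of `page`.
const page = Object.create(base);
page.name = "page";                         // property added on the fly

const inherited = page.greet();             // greet is found via the prototype chain
const hadName = "name" in page;             // true: the own property exists

delete page.name;                           // property deleted on the fly
const hasNameAfterDelete = "name" in page;  // false: neither own nor inherited
```

Because the set of properties can change at any program point, a static analysis cannot rely on a fixed object layout.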

Slide 4

Slide 4 text

Background - JavaScript Security
■ Web applications are moving toward client-side functionality and storage.
■ Exploitable JavaScript code exposes the user and the system to significant damage.
■ DOM-based Cross-site Scripting (XSS) is one of the top JavaScript vulnerabilities.
■ DOM-based XSS occurs purely on the client side.

Slide 5

Slide 5 text

DOM-based Cross-site Scripting (XSS)
DOM-based XSS!
document.write(document.URL);
Attack: www.site.com/page.html?default=alert(document.cookie)
■ Can cause serious damage, such as stealing user information or simply breaking the application's display of data.
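
To make the source-to-sink flow concrete outside a browser, the sketch below replaces `document` with a small stand-in object. `makeFakeDocument` is a hypothetical helper, and the `<script>` wrapper around the payload is an assumption: the attack string on the slide appears without markup.

```javascript
// Stand-in for the browser's `document`; hypothetical, for illustration only.
function makeFakeDocument(url) {
  const written = [];
  return {
    URL: url,                              // source: attacker-controlled
    write(html) { written.push(html); },   // sink: a browser would parse this as HTML
    output() { return written.join(""); },
  };
}

// Assumed attacker-supplied URL carrying markup in the query string.
const attackUrl =
  "http://www.site.com/page.html?default=<script>alert(document.cookie)</script>";

const doc = makeFakeDocument(attackUrl);
doc.write(doc.URL); // same shape as document.write(document.URL)

// The markup reaches the sink unchanged.
const payloadReachedSink =
  doc.output().includes("<script>alert(document.cookie)</script>");
```

Nothing between the source (`URL`) and the sink (`write`) transforms the data, which is the pattern a taint analysis looks for.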

Slide 6

Slide 6 text

Related Work - ACTARUS
■ ACTARUS is a static taint analysis for JavaScript [1].
■ Builds a static representation of the program that consists of a call graph.
■ Traverses the call graph to find sources, sinks, and sanitizers in the code and partitions them into rules.
■ Flows from a source to a sink that bypass the sanitizers are reported as potential security issues.

Slide 7

Slide 7 text

Related Work - Gatekeeper
■ A mostly static approach to enforcing security and reliability policies in JavaScript programs.
■ The policies are expressed as succinct declarative Datalog queries.
■ Programs are represented as a database of Datalog rules against which Gatekeeper policies are checked.
■ The policies include preventing code injection attempts and detecting cross-site scripting.
■ Employs points-to analysis for program understanding, and uses the analysis to detect

Slide 8

Slide 8 text

Related Work - Blended Taint Analysis
■ Combines static and dynamic analysis [3].
■ In the dynamic phase, information that is not statically known is collected.
■ In the static phase, a static taint analysis is run on the program.
■ Solutions from both phases are combined into a single solution.

Slide 9

Slide 9 text

Related Work - Summary
■ None of the previous works differentiates between the different HTML rendering contexts.
■ They all use a constant propagation domain to track strings.
■ Some of them combine dynamic analysis with static analysis, whereas our work is based purely on static analysis.

Slide 10

Slide 10 text

Rendering Contexts
■ Common contexts in a Web page:
  - JavaScript Context (JSC)
  - HTML Element Context (HEC)
  - HTML Attribute Context (HAC)
  - HTML Event-handler Attribute Context (HHC)
  - HTML URL Attribute Context (HUC)
  - CSS Attribute Context (HCC)
[Figure: example markup labeled with the JSC and HEC contexts; visible text "Click Here".]

Slide 11

Slide 11 text

Rendering Contexts
■ Each context requires its own encoding rules, as defined by OWASP [4].
■ Three encoding mechanisms:
  - JavaScript encoding (JE)
  - HTML encoding (HE)
  - URL encoding (UE)
■ The primary recommendation is to avoid placing user input in the JSC and HHC contexts.
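
The three encoding mechanisms can be sketched as follows. The names `encodeForHTML` and `encodeForJS` appear later in the slides, but the bodies here are simplified assumptions written for this example, not complete or production-grade encoders; `encodeForURL` is a hypothetical wrapper around the built-in `encodeURIComponent`.

```javascript
// HE: simplified HTML encoding (assumed implementation, not a complete encoder).
function encodeForHTML(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
  return s.replace(/[&<>"']/g, (c) => map[c]);
}

// JE: simplified JavaScript encoding; every character outside [A-Za-z0-9]
// becomes a \uXXXX escape sequence.
function encodeForJS(s) {
  return s.replace(/[^A-Za-z0-9]/g, (c) =>
    "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0"));
}

// UE: URL encoding via the built-in encodeURIComponent.
function encodeForURL(s) {
  return encodeURIComponent(s);
}

const he = encodeForHTML("<b onclick='x()'>");
const je = encodeForJS("a'b");
const ue = encodeForURL("a b&c");
```

Each encoder neutralizes a different set of characters, which is why an encoding that is safe in one context may be insufficient in another.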

Slide 12

Slide 12 text

Motivation
■ The function encodeURI() does not encode the single quotation mark.

DOM-based XSS!
document.write("<a id='" + encodeURI(DATA) + "'>Click Here</a>");
var d2 = document.getElementById("d2");
d2.innerHTML = encodeForJS(encodeForHTML(DATA));
Attack: 9' href='http://www.google.com
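
The behavior of `encodeURI()` on the attack string can be checked directly. This sketch reuses the slide's string concatenation but leaves out the DOM call so that it runs outside a browser:

```javascript
// Attack string from the slide.
const DATA = "9' href='http://www.google.com";

// encodeURI is a standard built-in; it escapes the space but leaves ' intact.
const encoded = encodeURI(DATA);

// Same concatenation as on the slide, without document.write.
const html = "<a id='" + encoded + "'>Click Here</a>";

// The first quote in the data closes the id attribute early.
const idClosedEarly = html.startsWith("<a id='9'");
```

Since the quote survives encoding, attacker data can break out of the single-quoted attribute value, so `encodeURI()` is the wrong sanitizer for this context.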

Slide 13

Slide 13 text

Background: TAJS
■ TAJS is a context-sensitive analyzer for JavaScript that supports the ECMAScript language and parts of the DOM, and infers type information.
■ Constructs flow graphs to represent JavaScript program code and performs data-flow analysis on them.
■ The analysis is based on the monotone framework using a lattice structure.
■ The lattice is based on constant propagation for all the primitive types of JavaScript values.
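
As a rough sketch of what constant propagation means for a single primitive value, the flat lattice below keeps a value only while every path agrees on it. The names `BOTTOM`, `TOP`, `constant`, and `join` are assumptions made for this example; this is not TAJS's actual code, and its real lattice is richer.

```javascript
// Flat lattice: BOTTOM (no value yet) < any single constant < TOP (unknown).
const BOTTOM = { kind: "bottom" };
const TOP = { kind: "top" };
const constant = (value) => ({ kind: "const", value });

// Least upper bound of two abstract values.
function join(a, b) {
  if (a.kind === "bottom") return b;
  if (b.kind === "bottom") return a;
  if (a.kind === "const" && b.kind === "const" && a.value === b.value) return a;
  return TOP; // differing constants, or either side is already TOP
}

// Both branches assign the same string: the constant is kept.
const sameOnBothBranches = join(constant("id"), constant("id"));

// The branches assign different strings: the merged value is unknown.
const differsAcrossBranches = join(constant("id"), constant("href"));
```

For strings this means that any value built in a loop or from differing branches collapses to "unknown".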

Slide 14

Slide 14 text

Approach: Taint Analysis
■ TAJS_taint defines a set of rules; each rule has the form (S_1, S_2, S_3).
■ S_1 ∈ Sources, S_2 ∈ ℘(Sanitizers), and S_3 ∈ Sinks.
■ Sources are either functions or object properties that can be controlled by the attacker.
■ Sinks are functions or object properties where data can be executed.
■ Sanitizers are functions that transform data as

Slide 15

Slide 15 text

Abstract Domain for Taint
■ Example of how rules are defined in TAJS_taint:
■ S_1 = { document.URL, location.href }
■ S_2 = { {encodeForJS(), encodeForHTML()} }
■ S_3 = { document.write() }
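
One possible reading of such a rule is sketched below. The data layout, the `violates` function, and the interpretation that every sanitizer of at least one listed set must have been applied are all assumptions made for this example; the slides do not spell out how rules are checked, and the order in which sanitizers are applied is ignored here.

```javascript
// Rule from the slide: sources, acceptable sanitizer sets, sinks.
const rule = {
  sources: new Set(["document.URL", "location.href"]),
  sanitizerSets: [new Set(["encodeForJS()", "encodeForHTML()"])],
  sinks: new Set(["document.write()"]),
};

// Hypothetical check: a flow violates the rule if it goes from a rule source
// to a rule sink and the applied sanitizers cover none of the acceptable sets.
function violates(rule, flow) {
  if (!rule.sources.has(flow.source) || !rule.sinks.has(flow.sink)) return false;
  const applied = new Set(flow.sanitizers);
  const covered = rule.sanitizerSets.some((required) =>
    [...required].every((s) => applied.has(s)));
  return !covered;
}

const unsanitized = violates(rule, {
  source: "document.URL", sanitizers: [], sink: "document.write()",
});
const sanitized = violates(rule, {
  source: "location.href",
  sanitizers: ["encodeForHTML()", "encodeForJS()"],
  sink: "document.write()",
});
```

Under this reading, `unsanitized` is a reported flow and `sanitized` is not.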

Slide 16

Slide 16 text

Abstract Domain for Taint
■ The abstract domain records which encodings a tainted value has gone through.
■ T = {HTML-encoded, JavaScript-encoded, URL-encoded}.
■ The abstract domain for taint analysis is ℘(T) ordered by ⊇.

Slide 17

Slide 17 text

Abstract Domain for Taint
T = { HTML-encoded, JavaScript-encoded, URL-encoded }
Let String be the set of all possible strings.
γ_T(HTML-encoded) = { HE(s) | s ∈ String }
γ_T(JavaScript-encoded) = { JE(s) | s ∈ String }
γ_T(URL-encoded) = { UE(s) | s ∈ String }
γ : ℘(T) → ℘(String) is defined as: γ(X) = ∩ { γ_T(t) | t ∈ X }
α : ℘(String) → ℘(T) is defined as: α(S) = { t | t ∈ T ∧ S ⊆ γ_T(t) }
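
Ordering ℘(T) by ⊇ means that a larger set of encodings is a more precise element, and that the least upper bound of two elements is their intersection. The sketch below shows this with JavaScript sets; the names `leq`, `join`, `TOP`, and `BOTTOM` are assumptions made for this example.

```javascript
const T = ["HTML-encoded", "JavaScript-encoded", "URL-encoded"];

// Top of the lattice: no encoding is guaranteed (gamma of the empty set is String).
const TOP = new Set();
// Bottom of the lattice: all encodings are guaranteed.
const BOTTOM = new Set(T);

// x is below y in the lattice exactly when x is a superset of y.
function leq(x, y) {
  return [...y].every((t) => x.has(t));
}

// Under the superset order, the least upper bound is set intersection: after a
// merge, only encodings applied on every incoming path are still guaranteed.
function join(x, y) {
  return new Set([...x].filter((t) => y.has(t)));
}

const pathA = new Set(["HTML-encoded", "JavaScript-encoded"]);
const pathB = new Set(["HTML-encoded"]);
const merged = join(pathA, pathB);
```

Here `merged` contains only `HTML-encoded`, because the JavaScript encoding was applied on just one of the two paths.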

Slide 18

Slide 18 text

String Abstract Domain (1)
■ TAJS_taint associates every string variable with a deterministic finite automaton (DFA).
■ TAJS uses a constant propagation domain to track strings.
■ A DFA is more precise when dealing with dynamic property access.

Slide 19

Slide 19 text

String Abstract Domain (2)
■ In TAJS, p points to all properties of the object obj.
■ In TAJS_taint, p is represented by gⁿ1.

function lookup(obj, str, p) {
  while (p.length < N) p = str + p;
  return obj[p];
}
lookup(obj, "g", "1");
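
The difference in precision can be sketched by comparing which properties each abstraction says `obj[p]` may read. The object below is hypothetical, and a regular expression stands in for the DFA; it accepts zero or more `g` followed by `1`, which assumes n ≥ 0 because the loop may not execute at all.

```javascript
// Hypothetical object whose properties the analysis must consider.
const obj = { "1": "a", g1: "b", gg1: "c", secret: "d" };

// Constant propagation loses p after the loop, so obj[p] may read any property.
const withConstantPropagation = Object.keys(obj);

// A regular expression stands in for the DFA describing p (n >= 0 assumed).
const dfaForP = /^g*1$/;
const withDfa = Object.keys(obj).filter((name) => dfaForP.test(name));
```

With the automaton, `secret` is excluded from the possible reads, so taint stored in unrelated properties does not spread to the result of `lookup`.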

Slide 20

Slide 20 text

Evaluation
■ Benchmarks are real Web sites chosen from Alexa [5].
■ Most true positives are caused by using the function encodeURI() in the wrong context.

Slide 21

Slide 21 text

Evaluation - Execution time

Slide 22

Slide 22 text

Conclusion
■ We introduced a new approach to analyzing JavaScript programs to detect DOM-based XSS vulnerabilities.
■ The approach is based on identifying the various rendering contexts and ensuring that data has been properly sanitized for each context.
■ TAJS_taint uses DFAs as a more precise string domain.
■ Results show that TAJS_taint detects DOM-based XSS vulnerabilities more precisely.

Slide 23

Slide 23 text

Thank you!