
Expression Tree Rebuild in Matlab

David
January 30, 2019

This work presents an analysis that performs expression tree reconstruction and redundant statement elimination for the Matlab language, using the McLab framework. Matlab is a popular high-level numerical language used by many scientists and engineers. Because it is proprietary software, it is important to have alternative compilers and execution environments. To make these efficient, a compiler must have an appropriate IR and optimizations that are amenable to Matlab. To this end, the McLab framework was brought forward as an inter/intra-procedural analysis framework, with its corresponding 3-address-code IR, TameIR. Although TameIR is very useful for analyzing code, its 3-address nature generates many unnecessary locals during code generation. This work presents an analysis that gets rid of these unnecessary locals by rebuilding expression trees.


Transcript

  1. Expression Tree Rebuild Optimization

    Sable Research Group, McGill University. David Herrera [email protected] January 30, 2019
    David Herrera (McGill University) MatWably January 30, 2019 1 / 28
  2. Re-Cap - 3-address-code

    Tamer transforms the Matlab AST into three-address-code, where every expression is made up only of NameExpr nodes (variable names). MatWably receives this 3-address-code and uses it to generate the wasm binary.
  3. Three Address Example

    Original Matlab Code

        function y = collatz(n)
            y = 0;
            while n ~= 1
                if mod(n, 2) == 0
                    n = n / 2;
                else
                    n = 3*n + 1;
                end
                y = y+1;
            end
        end

    TameIR 3-address Matlab Code

        function [y] = collatz(n)
            y = 0;
            mc_t15 = 1;
            [mc_t14] = ne(n, mc_t15);
            while mc_t14
                mc_t7 = 2;
                [mc_t4] = mod(n, mc_t7);
                mc_t12 = 0;
                [mc_t11] = eq(mc_t4, mc_t12);
                if mc_t11
                    mc_t8 = 2;
                    [n] = mrdivide(n, mc_t8);
                else
                    mc_t9 = 3;
                    [mc_t6] = mtimes(mc_t9, n);
                    mc_t10 = 1;
                    [n] = plus(mc_t6, mc_t10);
                end
                mc_t13 = 1;
                [y] = plus(y, mc_t13);
                mc_t15 = 1;
                [mc_t14] = ne(n, mc_t15);
            end
        end
  4. Resulting generated wasm

    TameIR 3-address Matlab Code

        function [y] = collatz(n)
            y = 0;
            mc_t15 = 1;
            [mc_t14] = ne(n, mc_t15);
            while mc_t14
                mc_t7 = 2;
                [mc_t4] = mod(n, mc_t7);
                mc_t12 = 0;
                [mc_t11] = eq(mc_t4, mc_t12);
                if mc_t11
                    %...

    MatWably wasm

        (func $collatz_S (param $n_f64 f64)(result f64)
            f64.const 0.0
            set_local $y_f64
            f64.const 1.0
            set_local $mc_t15_f64
            get_local $n_f64
            get_local $mc_t15_f64
            f64.ne
            set_local $mc_t14_i32
            get_local $mc_t14_i32
            if
                loop $loop_mc_t16
                    f64.const 2.0
                    set_local $mc_t7_f64
                    get_local $n_f64
                    get_local $mc_t7_f64
                    call $mod_SS
                    set_local $mc_t4_f64
                    f64.const 0.0
                    set_local $mc_t12_f64
                    get_local $mc_t4_f64
                    get_local $mc_t12_f64
                    f64.eq
                    set_local $mc_t11_i32
                    get_local $mc_t11_i32
                    if
                        ;;....
  5. Problem

    Generated code contains many locals that are either never used or simply hold a constant value. WebAssembly engines cannot optimize at start-up time, since they have not yet seen the code.

    Bad performance comes from:
    - No elimination of unnecessary locals.
    - A sub-optimal register allocation strategy.

    Consequences:
    - Unnecessary loads and stores are executed.
    - Bad register allocation due to unnecessary locals results in spilling to memory.
  6. Problem - wasm performance - Program 1

    Without the useless set_local instruction: 5.2(0.3) µs

        (func (param i32) (result i32)
          (local $i i32) (local $temp i32)
          loop
            ;; Missing useless set_local $temp
            (set_local $i (i32.add (get_local $i) (i32.const 1)))
            (br_if 0 (i32.lt_s (get_local $i) (get_local 0)))
          end
          i32.const 3
        )
        (export "func" (func 0))

    With the useless set_local instruction: 17.7(1.9) µs

        (func (param i32) (result i32)
          (local $i i32) (local $temp i32)
          loop
            i32.const 5
            set_local $temp
            (set_local $i (i32.add (get_local $i) (i32.const 1)))
            (br_if 0 (i32.lt_s (get_local $i) (get_local 0)))
          end
          i32.const 3
        )
        (export "func" (func 0))
  7. Problem - wasm performance - Program 2

    Without the useless set_local instruction: 37.2(2.1) µs

        (func (param i32) (result i32)
          (local $i i32) (local $temp i32)
          loop
            get_local $i
            i32.const 5
            i32.add
            set_local $temp
            (set_local $i (i32.add (get_local $i) (i32.const 1)))
            (br_if 0 (i32.lt_s (get_local $i) (get_local 0)))
          end
          i32.const 3
        )
        (export "func" (func 0))

    With the useless set_local instruction: 50.4(2.3) µs

        (func (param i32) (result i32)
          (local $i i32) (local $temp i32)
          loop
            i32.const 5
            set_local $temp
            get_local $i
            get_local $temp
            i32.add
            set_local $temp
            (set_local $i (i32.add (get_local $i) (i32.const 1)))
            (br_if 0 (i32.lt_s (get_local $i) (get_local 0)))
          end
          i32.const 3
        )
        (export "func" (func 0))
  8. Question

    Can we help start-up time by rebuilding these expression trees from the Matlab side?

    Expression Tree Rebuild Optimization (ETRO): Analysis and transformation that collects NameExpressions in TameIR and rebuilds the expression trees from them if possible.

        function [c] = func()
            a = 1
            b = 2
            d = a + b
            e = 4
            c = d*e
        end

    →

        function [c] = func()
            c = (1+2)*4
        end
  9. ETRO - Design Goals

    - Must get rid of unnecessary locals correctly.
    - Must build expression trees correctly.
    - Must conserve TameIR 3-address-code. TameIR 3-address-code is still a very useful format for code generation, so rather than modifying the TameIR AST, the analysis must produce extra data structures that aid in generating the expression trees from NameExpressions.
  10. ETRO - Data Structures

    To achieve this, two data structures are used in the analysis:
    - A map from variable-uses to corresponding expressions, the use_to_expr map.
    - A set of statements to eliminate, the redundant_stmts set.

    The expressions are reconstructed by recursively visiting the use_to_expr map. The declaration statements to delete are skipped during generation by checking whether they are included in the redundant_stmts set.
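The recursive reconstruction described above can be sketched in a few lines of Python. This is only an illustration, not the McLab implementation: the tuple encoding of expressions, the helper names `rebuild`, `show` and `generate`, and the choice to store the defining line next to each mapped expression are all assumptions made for the example. The input is the `a`, `b`, `d`, `e`, `c` program from the Question slide, written with TameIR-style `plus` and `mtimes` calls.

```python
# Hypothetical encoding: ("num", 1), ("name", "a"), ("call", "plus", [args]).

def rebuild(expr, line, use_to_expr):
    """Expand every mapped variable-use inside `expr`, which occurs at `line`."""
    kind = expr[0]
    if kind == "name":
        key = (line, expr[1])
        if key in use_to_expr:
            # The mapped expression has its own uses, located at its defining line.
            def_line, def_expr = use_to_expr[key]
            return rebuild(def_expr, def_line, use_to_expr)
        return expr
    if kind == "call":
        return ("call", expr[1], [rebuild(a, line, use_to_expr) for a in expr[2]])
    return expr  # literals are left untouched


def show(expr):
    """Pretty-print an expression tree."""
    kind = expr[0]
    if kind == "num":
        return str(expr[1])
    if kind == "name":
        return expr[1]
    return f"{expr[1]}({', '.join(show(a) for a in expr[2])})"


def generate(stmts, use_to_expr, redundant_stmts):
    """Emit one line per statement, skipping those in redundant_stmts."""
    out = []
    for line, (target, expr) in sorted(stmts.items()):
        if line in redundant_stmts:
            continue
        out.append(f"{target} = {show(rebuild(expr, line, use_to_expr))}")
    return out


# line number -> (assigned variable, right-hand side)
stmts = {
    2: ("a", ("num", 1)),
    3: ("b", ("num", 2)),
    4: ("d", ("call", "plus", [("name", "a"), ("name", "b")])),
    5: ("e", ("num", 4)),
    6: ("c", ("call", "mtimes", [("name", "d"), ("name", "e")])),
}
# (use line, variable) -> (defining line, defining expression)
use_to_expr = {
    (4, "a"): (2, stmts[2][1]),
    (4, "b"): (3, stmts[3][1]),
    (6, "d"): (4, stmts[4][1]),
    (6, "e"): (5, stmts[5][1]),
}
redundant_stmts = {2, 3, 4, 5}

print(generate(stmts, use_to_expr, redundant_stmts))
# prints ['c = mtimes(plus(1, 2), 4)']
```

The 3-address statements themselves are never rewritten; only the two side tables decide what the generator emits, which matches the design goal of conserving TameIR.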
  13. ETRO - Example 1

    What is the correct ETRO transformation in this case?

        1  function f()
        2      a = 3
        3      b = 4
        4      c = plus(a, b)
        5      e = a < b
        6      while e
        7          a = 5
        8          e = a<b
        9      end
        10 end

    Transformed:

        function f()
            c = plus(3, 4)
            e = 3 < 4
            while e
                e = 5 < 4
            end
        end

    Here statements a = 3, b = 4, and a = 5 make up the redundant_stmts set. Meanwhile, the variable-uses in lines 4, 5, and 8 make up the use_to_expr mapping, i.e. 4 : a → 3, 4 : b → 4, 5 : a → 3, 5 : b → 4, 8 : a → 5, 8 : b → 4.
  14. ETRO - Example 2

    What if we exchange the statements in the while-loop?

        function f()
            a = 3
            b = 4
            c = plus(a, b)
            e = a < b
            while e
                e = a<b
                a = 5
            end
        end

    Transformed:

        function f()
            a = 3
            c = plus(3, 4)
            e = 3 < 4
            while e
                e = a < 4
                a = 5
            end
        end
  16. ETRO - Example 2

    What went wrong?

        1  function f()
        2      a = 3
        3      b = 4
        4      c = plus(a, b)
        5      e = a < b
        6      while e
        7          e = a<b
        8          a = 5
        9      end
        10 end

    Two definitions (lines 2 and 8) could supply the value of the variable-use of a on line 7.

    Lesson: A variable-use is considered ambiguous if there is more than one possible definition for that variable-use. In this case we cannot simply replace the use by its defining expression.
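The ambiguity on line 7 can be checked mechanically with a textbook reaching-definitions analysis. The sketch below is a generic Python version, not McLab's ReachingDefinitions class; the control-flow graph is hand-built from the listing above, with each node named after its line number, and the function and helper names are made up for the example.

```python
def reaching_definitions(defs, succ):
    """defs: node -> variable defined there (or None).
    succ: node -> list of successor nodes.
    Returns node -> set of (variable, defining node) pairs reaching the node's entry."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    in_sets = {n: set() for n in succ}
    out_sets = {n: set() for n in succ}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for n in succ:
            # May-analysis: merge predecessors with set union.
            in_sets[n] = set().union(*(out_sets[p] for p in preds[n]))
            var = defs.get(n)
            new_out = {(v, d) for (v, d) in in_sets[n] if v != var}  # kill
            if var is not None:
                new_out.add((var, n))  # gen
            if new_out != out_sets[n]:
                out_sets[n] = new_out
                changed = True
    return in_sets


# Example 2: node numbers are the line numbers of the listing.
defs = {2: "a", 3: "b", 4: "c", 5: "e", 6: None, 7: "e", 8: "a"}
# The loop header (6) enters the body; the back edge is 8 -> 6.
# The loop exit edge is omitted because nothing follows the loop.
succ = {2: [3], 3: [4], 4: [5], 5: [6], 6: [7], 7: [8], 8: [6]}

in_sets = reaching_definitions(defs, succ)


def defs_of(var, node):
    """Lines whose definition of `var` may reach `node`."""
    return sorted(d for (v, d) in in_sets[node] if v == var)


print(defs_of("a", 5))  # [2]     unambiguous: a can be replaced by 3
print(defs_of("a", 7))  # [2, 8]  ambiguous: a must stay a variable
print(defs_of("b", 7))  # [3]     unambiguous: b can be replaced by 4
```

A use is unambiguous exactly when the list has one element, which is why line 7 becomes `e = a < 4` in the transformed code: `b` is replaced, `a` is not.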
  18. ETRO - Example 2

    What is the rule for the analysis then? For each variable declaration:
    - If every use of that variable is unambiguous: map every variable-use to the corresponding expression, and add the defining statement to the set of redundant statements.
    - If at least one use is ambiguous: map every unambiguous variable-use to the corresponding expression, and do not add the variable declaration statement involved in the ambiguity to the set of redundant statements.
  20. ETRO - Example 2

    What is the rule for the analysis then?
    - If there are no ambiguous definitions of uses: map all the uses to the expression of the defining statement, and add the defining statement to the set of redundant statements.
    - If there are ambiguous definitions of uses: only map uses to expressions for unambiguous variable-uses, and do not add the statements involved in the ambiguity to the set of redundant statements.

    Is this correct or complete? Can we simply use a ReachingDefinition analysis to achieve the desired outcome? No!
  21. ETRO - Example 3

    Consider the following:

        1  function f()
        2      a = 3
        3      b = 4
        4      c = plus(a, b)
        5      a(1) = 5
        6      e = a < b
        7      while e
        8          a = 5
        9          e = a<b
        10     end
        11 end

    Applying the previously specified rule, the variable-use of a in line 6 is incorrectly replaced with the definition in line 2, even though array a is modified in line 5. To fix this, we need to know whether a given variable a has been modified between the program point p1 at the definition of a and each of the program points of the uses that are reached uniquely by that definition.
  22. ETRO - General Rule

    What is the general rule then? A variable-use at program point p may be replaced by the expression that defines it if:
    1. The variable-use does not have an ambiguous definition, i.e. there is one and only one definition that could correspond to that variable-use.
    2. The defined variable is not modified between the program point of its definition and the program point p of the variable-use.

    A variable declaration may be added to the redundant statements, and thus eliminated, if:
    1. Every use of that variable is unambiguous, i.e. every use has a unique definition, which corresponds to the given variable declaration.
    2. The defined variable is not one of the output parameters or one of the global variables in the context.
  23. A little detour - Modified Variable Analysis

    For each program point p, we want to find the variables that are defined and have not been modified since their definition.

    Approximation: We approximate this with a set of statements at every program point. This set includes the variable definitions s of the form 'a = ...' where the array a has not been modified between that definition and program point p.

    Problem Definition: A variable definition i defined at program point d reaches program point p if, on all paths from d to p, the variable defined by i is neither modified nor redefined.

    Merge Operations: Let P1 and P2 be two predecessor nodes of node p. We use set intersection to get the resulting set.

    Starting Approx.: The out set of the entry node is the empty set, as we are not interested in the parameters of the function. Every other statement Si is approximated as out(Si) = .
  24. Modified Variable Analysis - Rules

    - TIRFunction ([] = func(a,b,c)): Do nothing!
    - TIRCopyStmt (a = b): Kill any definition of 'a', add new statement to the set.
    - TIRMJCopyStmt (a = copy(b)): Kill any definition of 'a', add new statement to the set.
    - TIRCallStmt (c = call(...args)): Kill any definition of 'c', add new statement to the set.
    - TIRArrayGetStmt (b = a(i,j,k)): Kill any definition of 'b', add new statement to the set.
    - TIRArraySetStmt (a(i,j,k) = b): Kill any definition of 'a'.
    - TIRLiteralStmt (a = 3, a = "a"): Kill any definition of 'a', add new statement to the set.
    - TIRReturn: Do nothing!
    - TIRBreak: Do nothing!
    - TIRContinue: Do nothing!
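The rules above, together with the intersection merge from the previous slide, can be sketched as a small forward must-analysis in Python. This is an illustration under stated assumptions rather than the actual McLab code: statements are reduced to a hypothetical `(kind, variable)` pair, where `"def"` covers the copy, call, array-get and literal rules, `"set"` is an array assignment, and `"none"` covers the do-nothing cases. The slide's starting approximation for non-entry statements is not legible in the transcript, so the sketch assumes the usual optimistic start for an intersection-based analysis, namely the set of all definitions.

```python
def unmodified_definitions(stmts, succ, entry):
    """stmts: node -> (kind, variable) with kind in {"def", "set", "none"}.
    succ: node -> list of successor nodes; every non-entry node needs a predecessor.
    Returns node -> set of defining nodes whose variable has been neither
    modified nor redefined on any path to the node's entry."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    universe = {n for n, (kind, _) in stmts.items() if kind == "def"}
    in_sets = {n: set() for n in succ}
    out_sets = {n: set(universe) for n in succ}  # assumed optimistic start

    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for n in succ:
            if n == entry:
                in_sets[n] = set()  # nothing is defined before the entry node
            else:
                # Must-analysis: merge predecessors with set intersection.
                in_sets[n] = set.intersection(*(out_sets[p] for p in preds[n]))
            kind, var = stmts[n]
            new_out = set(in_sets[n])
            if kind in ("def", "set"):
                # Kill every definition of the written variable.
                new_out = {d for d in new_out if stmts[d][1] != var}
            if kind == "def":
                new_out.add(n)  # a fresh, so far unmodified definition
            if new_out != out_sets[n]:
                out_sets[n] = new_out
                changed = True
    return in_sets


# Example 3: node numbers are the line numbers of the listing.
stmts = {
    2: ("def", "a"), 3: ("def", "b"), 4: ("def", "c"),
    5: ("set", "a"),                      # a(1) = 5
    6: ("def", "e"), 7: ("none", None),   # while header
    8: ("def", "a"), 9: ("def", "e"),
}
succ = {2: [3], 3: [4], 4: [5], 5: [6], 6: [7], 7: [8], 8: [9], 9: [7]}

unmodified = unmodified_definitions(stmts, succ, entry=2)
print(sorted(unmodified[4]))  # [2, 3]    a = 3 is still intact on line 4
print(sorted(unmodified[6]))  # [3, 4]    a = 3 was killed by a(1) = 5
print(sorted(unmodified[9]))  # [3, 4, 8] a = 5 from line 8 is intact
```

On Example 3 the definition on line 2 is absent from the set at line 6, which is the information that ReachingDefinitions alone, as used in that example, did not provide.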
  25. Back To Modified Variable Analysis...

    Use both ReachingDefinitions and ModifiedVariableAnalysis to build the use_to_expr map and the redundant_stmts set. Go through each statement with a variable definition and check all of its uses: for each use, determine ambiguity using ReachingDefinitions, and whether the definition has been modified in between using the ModifiedVariableAnalysis.
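A possible way to combine the two analysis results is sketched below in Python. Everything here is illustrative: the function name `build_etro_tables`, the table layouts, and the hand-written inputs for Example 3 are assumptions, and the map stores the defining line instead of the expression to keep the example short. Three further assumptions go beyond the slides: array assignments such as `a(1) = 5` are not counted as definitions by the reaching-definitions table, a definition with no uses is left in place (as `c = plus(3, 4)` is in Example 1), and a definition is eliminated only if every use it reaches was actually replaced. The extra conditions for call and copy statements are not modelled.

```python
def build_etro_tables(defs, uses, reaching, unmodified, protected=()):
    """defs: defining line -> variable.
    uses: list of (use line, variable).
    reaching: (use line, variable) -> set of defining lines that may reach the use.
    unmodified: use line -> set of defining lines still unmodified at that line.
    protected: output parameters and globals, which are never eliminated."""
    use_to_expr = {}
    for use_line, var in uses:
        candidates = reaching[(use_line, var)]
        if len(candidates) == 1:                     # unambiguous
            (def_line,) = candidates
            if def_line in unmodified[use_line]:     # not modified in between
                use_to_expr[(use_line, var)] = def_line

    redundant_stmts = set()
    for def_line, var in defs.items():
        if var in protected:
            continue
        reached = [u for u in uses if u[1] == var and def_line in reaching[u]]
        # Assumption: keep definitions without uses, drop only fully replaced ones.
        if reached and all(use_to_expr.get(u) == def_line for u in reached):
            redundant_stmts.add(def_line)
    return use_to_expr, redundant_stmts


# Hand-written analysis results for Example 3 (lines as in the listing).
defs = {2: "a", 3: "b", 4: "c", 6: "e", 8: "a", 9: "e"}
uses = [(4, "a"), (4, "b"), (6, "a"), (6, "b"), (7, "e"), (9, "a"), (9, "b")]
reaching = {
    (4, "a"): {2}, (4, "b"): {3},
    (6, "a"): {2}, (6, "b"): {3},
    (7, "e"): {6, 9},
    (9, "a"): {8}, (9, "b"): {3},
}
unmodified = {4: {2, 3}, 6: {3, 4}, 7: {3, 4}, 9: {3, 4, 8}}

use_to_expr, redundant_stmts = build_etro_tables(defs, uses, reaching, unmodified)
print(sorted(use_to_expr.items()))
print(sorted(redundant_stmts))  # [3, 8]
```

Under these assumptions the use of `a` on line 6 is left alone and `a = 3` survives, while `b = 4` and the `a = 5` inside the loop become redundant.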
  26. Specifics on interesting statements

    TIRCallStmt a = call(c, d, e): Only consider this statement in the analysis if:
    1. Variable a is only used once.
    2. The call is pure, i.e. it does not have any side-effects.
    3. If we can determine that the call is simple, e.g. ones(), one that returns a constant, we may be lenient on the first condition.

    TIRCopyStmt a = b: Only consider removal of this statement in the analysis if:
    1. The conditions of unambiguity and non-modification hold for every use of variable 'a' (as required before).
    2. When we replace each 'a' variable-use by the 'b' expression, the corresponding 'b' variable-use is also unambiguous and unmodified.
  27. Other Details

    Globals complicate things, as they require inter-procedural analysis; therefore we ignore variables that are globals.

    Return variables must also not be eliminated. At each return point we must have a defined output variable, or at least an unambiguous/unmodified defining expression for it. This is a little bit of a pain, as returns are not explicit uses in Matlab... In my analysis I have ignored them, but with a little more effort they could be added.

    Last note: the two flow analyses, ReachingDefinitions and ModifiedVariableAnalysis, could be combined into a single one (UnmodifiedReachingDefinitions).
  28. Sable Group - Latest Conquest (Two days of work)