SMT-Friendly Formalization of the Solidity Memory Model

Talk at SMT 2020 about the ESOP 2020 paper: https://dx.doi.org/10.1007/978-3-030-44914-8_9

The paper presents a high-level formalization of the Solidity language with a focus on the memory model. The formalization is implemented in the solc-verify verifier (see https://github.com/SRI-CSL/solidity).

Transcript

  1. SMT-Friendly Formalization of the Solidity Memory Model
    Ákos Hajdu¹, Dejan Jovanović²
    ¹Budapest University of Technology and Economics
    ²SRI International
    Paper originally published at ESOP 2020
    link.springer.com/chapter/10.1007/978-3-030-44914-8_9
    Presented at SMT 2020, July 5th

  2. Solidity Smart Contracts

  3. Distributed Computing Platforms
    • Store data (blockchain) and execute code (smart contracts)
    • No trusted central party
    • Consensus protocol

  4. Distributed Computing Platforms
    • Conceptually a single-world-computer abstraction
    – Example: Ethereum

  5. Solidity Smart Contracts
    contract DataStorage {
      // Complex datatype
      struct Record { bool set; int[] data; }

      // State variable(s): permanent storage (blockchain)
      mapping(address=>Record) private records;

      // Function(s): called with transactions
      function append(address at, int d) public {
        // Pointer to storage in internal scope
        Record storage r = records[at];
        r.set = true;
        r.data.push(d);
      }

      // Parameters, return values, local variables in transient memory
      function get(address at) public view returns (int[] memory ret) {
        require(isset(records[at]));
        ret = records[at].data;
      }

      // Pointer to storage in internal scope
      function isset(Record storage r) internal view returns (bool s) {
        s = r.set;
      }
    }

  6. Verification Landscape
    • Bytecode-level tools
    – Slither, Mythril, …
    – Various formalizations
    – Mostly vulnerable patterns
    – Limited effectiveness and automation for high-level properties
    • Solidity-level tools
    – SMTchecker, solc-verify, VeriSol, …
    – High-level, functional properties
    – Usually based on SMT
    – Modular verification, bounded model checking, symbolic execution
    – Precise formalization required
    Memory model lacks detailed and effective formalization

  7. Target Language
    • Simple SMT-based program
    – Types: primitive, datatype, array
    – Variable declarations
    – Statements: assign, assume, if-then-else
    – Expressions: identifier, array read/write, datatype constructor, member selector, conditional, basic arithmetic
    • Can be expressed in any SMT-based tool
    – Boogie, Why3, Dafny, …
    – Check by translating to SSA
    Point(x : int, y : int)
    pts : [int]Point
    pts[0] := Point(1, 2)
    pts[1].x := pts[0].x + 1
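
The target-language example on this slide can be mimicked in ordinary code. The following is a minimal Python sketch, not part of the talk: the helper names `store` and `select` are hypothetical, and the default value returned for an unset index is an assumption (in an SMT array it would simply be unconstrained). It shows that datatypes are immutable values and that an array write produces a new array.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    """SMT-style datatype: an immutable value with a constructor and selectors."""
    x: int
    y: int


def store(arr: dict, idx: int, val: Point) -> dict:
    """Array write: returns a new array and leaves the old one untouched."""
    new = dict(arr)
    new[idx] = val
    return new


def select(arr: dict, idx: int, default: Point = Point(0, 0)) -> Point:
    """Array read; the default for unset indices is an assumption of this sketch."""
    return arr.get(idx, default)


pts: dict = {}                                    # pts : [int]Point
pts = store(pts, 0, Point(1, 2))                  # pts[0] := Point(1, 2)
# pts[1].x := pts[0].x + 1, written as a functional update of pts[1]
pts = store(pts, 1, replace(select(pts, 1), x=select(pts, 0).x + 1))
```

Writing each update as "build a new value" is close to what a translation to SSA form does with the assignments.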

  8. Formalizing the Solidity memory model

  9. Overview
    contract C {
      struct S { int x; T[] ta; }
      struct T { int z; }
      T t1;
      S s1;
      S[] sa;
    }
    • Storage: value semantics
    [Figure: storage layout of t1, s1, and sa]
    • Local storage pointers
    function f() public {
      T storage tp = sa[1].ta[2];
      g(tp);
    }
    function g(T storage t) internal {
      t.z = 5;
    }
    • Memory: reference semantics
    function f() public pure {
      S memory sm1 = S(1, new T[](2));
      T memory t = sm1.ta[1];
      S memory sm2 = S(2, sm1.ta);
    }
    [Figure: memory graph of sm1, sm2, and t]
    • No mixing (memory and storage stay separate)

  10. Encoding the Memory
    • Standard heap model (per type)
    – Pointer: SMT integer
    – Struct: SMT datatype
    – Array: SMT array + length
    – No null, default values recursively
    struct T { int z; }
    struct S { int x; T[] ta; }
    S memory sm1 = S(1, new T[](2));

    Tmem(z : int)
    Tmemarr(arr : [int]int, len : int)
    Smem(x : int, ta : int)
    heapT : [int]Tmem
    heapTA : [int]Tmemarr
    heapS : [int]Smem
    sm1 : int

    Addresses are taken from the allocation counter:
    heapT[0] := Tmem(0)
    heapT[1] := Tmem(0)
    heapTA[2] := Tmemarr([0, 1], 2)
    heapS[3] := Smem(1, 2)
    sm1 := 3

    Reading sm1.ta[1].z:
    sm1          heapS[3]
    sm1.ta       heapTA[heapS[3].ta]
    sm1.ta[1]    heapT[heapTA[heapS[3].ta].arr[1]]
    sm1.ta[1].z  heapT[heapTA[heapS[3].ta].arr[1]].z
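
The heap encoding above can be replayed in executable form. The following Python sketch is an illustration, not the tool's implementation: the class `Memory` and its method names are hypothetical, dictionaries stand in for SMT arrays, and the field `length` stands in for `len` on the slide. It allocates `S(1, new T[](2))` with a single allocation counter and resolves `sm1.ta[1].z` through the three per-type heaps.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Tmem:
    z: int


@dataclass(frozen=True)
class Tmemarr:
    arr: Tuple[int, ...]   # pointers into heapT
    length: int


@dataclass(frozen=True)
class Smem:
    x: int
    ta: int                # pointer into heapTA


class Memory:
    """One heap per type plus a shared allocation counter (hypothetical helper)."""

    def __init__(self) -> None:
        self.heapT: Dict[int, Tmem] = {}
        self.heapTA: Dict[int, Tmemarr] = {}
        self.heapS: Dict[int, Smem] = {}
        self.refcnt = 0

    def fresh(self) -> int:
        """Return the next unused address and advance the counter."""
        addr = self.refcnt
        self.refcnt += 1
        return addr

    def new_T(self) -> int:
        p = self.fresh()
        self.heapT[p] = Tmem(0)          # default value instead of null
        return p

    def new_T_array(self, n: int) -> int:
        elems = tuple(self.new_T() for _ in range(n))   # defaults, recursively
        p = self.fresh()
        self.heapTA[p] = Tmemarr(elems, n)
        return p

    def new_S(self, x: int, ta: int) -> int:
        p = self.fresh()
        self.heapS[p] = Smem(x, ta)
        return p


mem = Memory()
sm1 = mem.new_S(1, mem.new_T_array(2))                    # S memory sm1 = S(1, new T[](2));
z = mem.heapT[mem.heapTA[mem.heapS[sm1].ta].arr[1]].z     # sm1.ta[1].z
```

Because the array argument is allocated before the struct, the addresses come out in the same order as on the slide: the two `T` elements, then the array, then the struct.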

  11. Encoding the Memory
    • Scope limited to a single transaction
    • Non-aliasing and new allocations
    – Require quantifiers in the general case (decidable fragment)
    struct T { int z; }
    struct S { int x; T[] ta; }
    function f(S memory sm) {
      ... = S(...)    // new allocations should not alias with sm
    }
    refcnt is the allocation counter:
    assume(sm < refcnt)                          // sm
    assume(heapS[sm].ta < refcnt)                // sm.ta
    forall 0 <= i < heapTA[heapS[sm].ta].len:    // sm.ta[i] (for each i)
      assume(heapTA[heapS[sm].ta].arr[i] < refcnt)
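
To see why these assumptions rule out aliasing, the following Python sketch evaluates them on a concrete heap. It is only an illustration: a verifier assumes these facts symbolically rather than checking them, and the function name `non_aliasing_holds` and the dictionary-based heaps are hypothetical.

```python
def non_aliasing_holds(sm, heapS, heapTA, refcnt):
    """Evaluate the three assumptions from the slide for a parameter sm of type S memory."""
    if not sm < refcnt:                              # assume(sm < refcnt)
        return False
    ta = heapS[sm]["ta"]
    if not ta < refcnt:                              # assume(heapS[sm].ta < refcnt)
        return False
    arr = heapTA[ta]
    # forall 0 <= i < len: assume(arr[i] < refcnt)
    return all(arr["arr"][i] < refcnt for i in range(arr["len"]))


# Concrete heap matching the earlier allocation example: struct at 3, array at 2, elements at 0 and 1.
heapS = {3: {"x": 1, "ta": 2}}
heapTA = {2: {"arr": [0, 1], "len": 2}}

ok = non_aliasing_holds(3, heapS, heapTA, refcnt=4)
reachable = {3, 2, 0, 1}      # sm, sm.ta, and every sm.ta[i]
fresh = 4                     # the next allocation takes the current counter value
```

Every reference reachable from `sm` lies below the counter, and a new allocation takes the counter value itself, so the new object cannot coincide with anything reachable from `sm`.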

  12. Encoding the Storage
    • Encode with SMT datatypes without heaps
    – Non-aliasing and deep copy ensured out-of-the-box
    – Especially useful in modular verification
    – Otherwise many framing conditions for functions
    struct T { int z; }
    struct S { int x; T[] ta; }
    contract C {
      T t1;
      S s1;
      S[] sa;
    }

    Tstor(z : int)
    Tstorarr(arr : [int]Tstor, len : int)
    Sstor(x : int, ta : Tstorarr)
    Sstorarr(arr : [int]Sstor, len : int)
    t1 : Tstor
    s1 : Sstor
    sa : Sstorarr
    Local storage pointers?
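
The claim that value-typed storage gives deep copy and non-aliasing for free can be illustrated with immutable records. The following Python sketch is an analogy under stated assumptions, not the tool's encoding: frozen dataclasses stand in for SMT datatypes, tuples stand in for SMT arrays, and the field `length` stands in for `len` on the slide.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Tstor:
    z: int


@dataclass(frozen=True)
class Tstorarr:
    arr: Tuple[Tstor, ...]
    length: int


@dataclass(frozen=True)
class Sstor:
    x: int
    ta: Tstorarr


s1 = Sstor(x=1, ta=Tstorarr((Tstor(7),), 1))

# Storage-to-storage assignment: values cannot be shared mutably,
# so plain assignment already behaves like a deep copy.
s2 = s1

# s2.ta[0].z := 5, written as a functional update that rebuilds the path to the element.
new_arr = (replace(s2.ta.arr[0], z=5),) + s2.ta.arr[1:]
s2 = replace(s2, ta=replace(s2.ta, arr=new_arr))
```

After the update `s1` is unchanged, which is the framing property a heap-based model would have to state explicitly.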

  13. Local Storage Pointers
    • Storage is a finite-depth tree of values*
    • Each element identified by path → encode with SMT integer array
    *Limited support for recursive data types, not used in practice.
    contract C {
      struct T {
        int z;
      }
      struct S {
        int x;
        T t;
        T[] ta;
      }
      T t1;
      S s1;
      S[] sa;
    }
    Storage tree for T (edge label in parentheses):
    C
    ├─ t1 (0): T
    ├─ s1 (1): S
    │  ├─ t (0): T
    │  └─ ta (1): T[]
    │     └─ [i] (i): T
    └─ sa (2): S[]
       └─ [i] (i): S
          ├─ t (0): T
          └─ ta (1): T[]
             └─ [i] (i): T

  14. Local Storage Pointers
    • Packing: expression to SMT array
    – Fit expression to tree
    contract C {
      struct T {
        int z;
      }
      struct S {
        int x;
        T t;
        T[] ta;
      }
      T t1;
      S s1;
      S[] sa;
    }
    Path in the storage tree: C, sa (2), [i] (i), ta (1), [i] (i), T
    T storage t = sa[8].ta[5];
    t : [int]int
    t := [2, 8, 1, 5]
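
Packing can be sketched as a walk over the storage tree that records one integer per step. The following Python sketch is an illustration only: the nested-dictionary tree layout, the `"[i]"` key for array levels, and the function name `pack` are hypothetical choices, while the member identifiers (t1 = 0, s1 = 1, sa = 2, t = 0, ta = 1) are taken from the slides.

```python
# Storage tree for pointers to T in contract C.
# A struct or contract node maps member names to (id, subtree);
# an array node is {"[i]": subtree}; None marks a T leaf.
T = None
S = {"t": (0, T), "ta": (1, {"[i]": T})}
C = {"t1": (0, T), "s1": (1, S), "sa": (2, {"[i]": S})}


def pack(path, tree=C):
    """Fit an access path such as ["sa", 8, "ta", 5] to the tree and return its integer encoding."""
    out = []
    node = tree
    for step in path:
        if isinstance(step, int):
            node = node["[i]"]        # array level: the index itself is recorded
            out.append(step)
        else:
            ident, node = node[step]  # member access: the member's id is recorded
            out.append(ident)
    if node is not None:
        raise ValueError("path does not end at a value of type T")
    return out


t = pack(["sa", 8, "ta", 5])          # T storage t = sa[8].ta[5];
```

In the real encoding the indices may be symbolic expressions; concrete integers are used here only to keep the sketch runnable.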

  15. Local Storage Pointers
    • Unpacking: SMT array to expression
    – Conditional based on tree
    contract C {
      struct T {
        int z;
      }
      struct S {
        int x;
        T t;
        T[] ta;
      }
      T t1;
      S s1;
      S[] sa;
    }
    function f(T storage ptr) {
      ... ptr.z;
    }
    ite(ptr[0] = 0,
      t1,
      ite(ptr[0] = 1,
        ite(ptr[1] = 0,
          s1.t,
          s1.ta[ptr[2]]),
        … )).z
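
Unpacking is the inverse walk: the packed array selects a branch of the tree at each level. The following Python sketch mirrors the nested conditional with `if` statements. The function name `unpack` and the dictionary representation of storage are hypothetical. The slide elides the branch for `sa`, so the last two lines are an assumption inferred from the packing of `sa[8].ta[5]` as `[2, 8, 1, 5]`, not something the slide states.

```python
def unpack(ptr, t1, s1, sa):
    """Resolve a packed storage pointer to the T value it denotes."""
    if ptr[0] == 0:
        return t1
    if ptr[0] == 1:
        # Inside s1: member t (0) or element ptr[2] of member ta (1)
        return s1["t"] if ptr[1] == 0 else s1["ta"][ptr[2]]
    # Assumed branch for ptr[0] == 2: element ptr[1] of sa, then the same choice as for s1
    s = sa[ptr[1]]
    return s["t"] if ptr[2] == 0 else s["ta"][ptr[3]]


# Example storage state with distinct z values so each branch is distinguishable.
t1 = {"z": 1}
s1 = {"x": 0, "t": {"z": 2}, "ta": [{"z": 3}, {"z": 4}]}
sa = [{"x": 0, "t": {"z": 5}, "ta": [{"z": 6}]}]

z = unpack([1, 1, 1], t1, s1, sa)["z"]    # ptr.z where ptr denotes s1.ta[1]
```

The depth of the conditional is bounded by the depth of the storage tree, which is why the finite-depth assumption matters.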

  16. Assignments Between Data Locations
    • Ensured by construction
    • Ensured by pack/unpack
    • Need manual copying
    – Requires quantifiers in the general case

    LHS \ RHS      Storage         Memory          Storage ptr.
    Storage        Deep copy       Deep copy       Deep copy
    Memory         Deep copy       Pointer assign  Deep copy
    Storage ptr.   Pointer assign  Error           Pointer assign
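
The table can be read as a lookup from the data locations of the two sides of an assignment to the required semantics. The following Python sketch simply transcribes it, with rows as the left-hand side and columns as the right-hand side. The names `SEMANTICS` and `assignment_semantics` are hypothetical.

```python
# (LHS location, RHS location) -> semantics of the assignment, transcribed from the table.
SEMANTICS = {
    ("storage", "storage"): "deep copy",
    ("storage", "memory"): "deep copy",
    ("storage", "storage ptr"): "deep copy",
    ("memory", "storage"): "deep copy",
    ("memory", "memory"): "pointer assign",
    ("memory", "storage ptr"): "deep copy",
    ("storage ptr", "storage"): "pointer assign",
    ("storage ptr", "memory"): "error",
    ("storage ptr", "storage ptr"): "pointer assign",
}


def assignment_semantics(lhs: str, rhs: str) -> str:
    """Return what an assignment lhs = rhs means for the given data locations."""
    return SEMANTICS[(lhs, rhs)]


kind = assignment_semantics("memory", "memory")
```

For example, `ret = records[at].data` in the `DataStorage` contract assigns storage to a memory variable, so the table gives a deep copy.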

  17. Details in the Paper
    arxiv.org/abs/2001.03256

  18. Evaluation

  19. Compared Tools
    • solc-verify (our tool) github.com/SRI-CSL/solidity
    – Modular verifier based on Boogie/SMT and the presented encoding
    • Mythril github.com/ConsenSys/mythril
    – Symbolic execution engine running over bytecode
    • VeriSol github.com/microsoft/verisol
    – Modular/BMC tool based on Boogie/SMT
    – Heap-based modeling of memory and storage
    • SMTchecker github.com/ethereum/solidity
    – SMT-based intra-function analyzer built into the compiler

  20. Tests
    • "Real-world" contracts: limited for evaluating memory semantics
    – Many old versions, new features are rare
    – Many toy examples, overrepresented categories
    – Complex contracts depend on other features
    • Manually developed tests
    – 325 test cases organized into categories
    – Assign, delete, init, storage, storage pointer
    – Exercise a specific feature, check result with assertion
    contract InitMemoryArrayFixedSize {
      function test() public pure {
        int[2] memory a;
        assert(a.length == 2);
        assert(a[0] == 0);
        assert(a[1] == 0);
      }
    }
    github.com/dddejan/solidity-semantics-tests

  21. Results
    • Bytecode level is precise
    • solc-verify comes close
    – We have some unimplemented features
    [Figure: five bar charts, one per test category (assign, delete, init, storage, storageptr), each comparing myth, verisol, smt-checker, and solc-verify]
    Mythril bug report: github.com/ConsenSys/mythril/issues/1282

  22. Results
    • Low computational cost for solc-verify
    [Figure: execution time (s) on a logarithmic axis from 1 to 1000, per category (assignment, delete, init, storage, storageptr), for solc-verify, smt-checker, verisol, and myth]

  23. Summary

  24. Summary
    • SMT-friendly formalization of the Solidity memory model
    – Memory: standard heap
    – Storage: values
    – Local storage pointers: encode path
    • Implementation
    – solc-verify: modular verifier
    – Extensive set of test cases
    – On par with bytecode-level tools, at low computational cost
    hajduakos.github.io
    csl.sri.com/users/dejan
    arxiv.org/abs/2001.03256
    github.com/SRI-CSL/solidity