
TMPA-2015: Implementing the MetaVCG Approach in the C-light System

Exactpro
December 01, 2015


Alexei Promsky, Dmitry Kondratyev, A.P. Ershov Institute of Informatics Systems, Novosibirsk

12 - 14 November 2015
Tools and Methods of Program Analysis in St. Petersburg


Transcript

  1. Implementing the MetaVCG approach in the
    C-light system
    Dmitry Kondratyev Alexei Promsky
    A.P. Ershov Institute of Informatics Systems


  2. Team and Aims
    A.P. Ershov Institute of Informatics Systems, Siberian Branch
    of Russian Academy of Sciences, Novosibirsk, Russia.
    Theoretical Programming Lab.: Prof. Valery Nepomnyaschy,
    Prof. Nikolay Shilov, Igor Anureev, Alexey Promsky, Ilya
    Maryasov, Dmitry Kondratyev, ...
    Studies of theoretical foundations of informatics which can be
    applied in practical tasks such as
    modeling of sequential and parallel processes;
    semantics and specification;
    program verification.
    C program verification is one of our high-priority goals.


  3. Why the C language?
    Still popular (according to the latest TIOBE index).
    Basis for the kindred languages: (1), (3) and (4). Can we work
    with them if we are unable to verify C programs?
    Especially when not all problems of C program verification are
    solved.
    The interest in C program verification is confirmed by
    researchers: VERISOFT, WHY (Frama-C), VCC.


  4. C-light and C-kernel
    Correct approaches/algorithms at every step.
    What do the founders say?
    Restrictions that contribute to provability are what
    make a programming language good. C.A.R. Hoare.
    The C-light language
    covers a major part of the C99 (C0 completely, Misra C
    almost);
    sets the evaluation order;
    avoids some low-level features.
    The C-kernel language possesses light and sound axiomatic
    semantics.
    Easy addition of new/remaining constructions?
    This is exactly what our research serves for!


  5. The C-light Verification System: overview
    /*@ requires \nothing;
    assigns e;
    ensures \result == \old(e) && e == \old(e) + 1;
    */
    e++
    /*@ requires \nothing;
    assigns e;
    ensures \result == \old(e) && e == \old(e) + 1;
    */
    (q = &e, y = *q, *q = *q + 1, y)
    MD_1 = upd(MD_0, MeM(q), MeM(e)) ∧
    MD_2 = upd(MD_1, MeM(y), MD_1(MeM(q))) ∧
    MD = upd(MD_2, MD_2(MeM(q)), BinOpSem(+, MD_2(MeM(q)), 1)) ∧
    Val = MD(MeM(y)) ⇒
    Val = \old(MD(MeM(e))) ∧ MD(MeM(e)) = \old(MD(MeM(e))) + 1
    Pipeline: the annotated C-light program is passed to the translator,
    which translates it into an annotated C-kernel program; that program
    is passed to the VCG, which generates the verification condition;
    the condition is passed to Simplify or Z3, which validates it.
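
The translation step on this slide can be tried directly in plain C. The sketch below is only an illustration: the function names and the file-level variable `e` are assumptions made for the example, not part of the C-light translator. It places the C-light expression `e++` next to the C-kernel comma expression from the slide and lets the `ensures` clause be checked at run time.

```c
#include <assert.h>

/* Hypothetical test variable standing for the object named e on the slide. */
static int e;

/* C-light form: the post-increment expression itself. */
int clight_post_increment(void)
{
    return e++;
}

/* C-kernel form from the slide: (q = &e, y = *q, *q = *q + 1, y).
   The comma operator sequences its operands left to right, so the
   evaluation order is fixed. */
int ckernel_post_increment(void)
{
    int *q;
    int y;
    return (q = &e, y = *q, *q = *q + 1, y);
}
```

Both functions should return the old value of `e` and leave `e` one greater, which is what the contract `\result == \old(e) && e == \old(e) + 1` states.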


  6. The C-light Verification System: some details


  7. The C-light Verification System: things done so far
    Verification of some challenges from the well-known
    collections: aliasing, abrupt termination, side effects, function
    pointers.
    Specification and verification of a subset of the Standard C
    Library:
    /*@ requires \valid_range(s1, 0, strlen(s2)) && valid_string(s2);
    assigns s1[0..strlen(s2)];
    ensures strcmp(s1, s2) == 0 && \result == s1;
    ensures \base_addr(\result) == \base_addr(s1);
    */
    char *strcpy(char *restrict s1, const char *restrict s2)
    Experiments on self-applicability. Our translator from C-light
    into C-kernel is implemented using Clang (C++ API),
    however, a good part of its functionality is expressible in
    C-light.
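
The executable part of the `strcpy` contract can be exercised with run-time assertions. This is a minimal sketch, not the verification itself: `checked_strcpy` is a hypothetical wrapper, and the memory-validity clauses (`\valid_range`, `\base_addr`) are not checked because plain C cannot observe them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper around the standard strcpy.
   The caller must provide strlen(s2) + 1 writable bytes in s1,
   which corresponds to \valid_range(s1, 0, strlen(s2)). */
char *checked_strcpy(char *restrict s1, const char *restrict s2)
{
    char *result = strcpy(s1, s2);
    assert(strcmp(s1, s2) == 0);   /* ensures strcmp(s1, s2) == 0 */
    assert(result == s1);          /* ensures \result == s1 */
    return result;
}
```

A static proof establishes these properties for all inputs; the assertions only test them for the inputs that are actually run.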


  8. The C-light Verification System: things done so far
    /*@ requires 1 <= id <= UINT_MAX;
    assigns id;
    behavior somewhere_in_the_middle:
    assumes 1 <= \old(id) < UINT_MAX;
    ensures id == \old(id) + 1 &&
    strcmp(\result, strcat("BLOCK\0",
    ltoa(\old(id)))) == 0;
    behavior too_many_blocks:
    assumes \old(id) == UINT_MAX;
    ensures \result == NULL;
    complete behaviors somewhere_in_the_middle, too_many_blocks;
    disjoint behaviors somewhere_in_the_middle, too_many_blocks;
    */
    char* getBlockID()
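
The slide gives only the contract of `getBlockID`, not its body. One possible body that is consistent with the two behaviors is sketched below. Everything in it is an assumption: the global counter `id`, the static buffer `block_name`, and the use of `snprintf` in place of the `strcat`/`ltoa` expression in the specification.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state: the counter named id in the contract. */
static unsigned int id = 1;

/* Hypothetical result buffer; "BLOCK" plus the decimal digits of an
   unsigned int fits comfortably in 32 bytes. */
static char block_name[32];

char *getBlockID(void)
{
    assert(id >= 1);                 /* requires 1 <= id <= UINT_MAX */
    if (id == UINT_MAX)
        return NULL;                 /* behavior too_many_blocks */
    /* behavior somewhere_in_the_middle: name built from the old id */
    snprintf(block_name, sizeof block_name, "BLOCK%u", id);
    id++;
    return block_name;
}
```

Returning a pointer to a static buffer keeps the sketch short; each call overwrites the previous name.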


  9. Current task: addition of new axiomatic rules
    In practice, the axiomatic semantics of a language takes the form
    of a Verification Condition Generator (VCG), thus reducing the
    question of program correctness to the truth of lemmas
    (verification conditions, VCs) in some applied theory.
    We would like to add new axioms and rules to our Hoare systems,
    and consequently to the VCG, easily and correctly.
    The reasons:
    The complete C language or a transition to C++. No doubts
    here.
    Context-specific rules or even specialized VCGs. Not so
    obvious. Some examples are required.


  10. Why new axiomatic rules: example 1
    During the Library verification the following pattern was found:
      swap(x, y, buf) ≡ memcpy(buf, x, m);
                        memcpy(x, y, m);
                        memcpy(y, buf, m);
    The general rule for the function call looks like
      {P'} f(x) {Q'},  P ⇒ (P'α ∧ Q'γ ⇒ Qγβ)
      ---------------------------------------
      {P} f(e); {Q}
    The substitutions α, β model the argument passing, whereas γ
    renames the variables and, thus, is equivalent to quantification.
    In the meantime, we can enrich the Hoare system with the following
    axiom:
      {x = x_0 ∧ y = y_0} swap(x, y, buf) {x = y_0 ∧ y = x_0}
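
The swap pattern and its axiom can be illustrated with a few lines of C. The function name `swap_bytes` and the explicit size parameter `m` are assumptions for this sketch; the three `memcpy` calls are the pattern from the slide. The regions `x`, `y` and `buf` must not overlap, as `memcpy` requires.

```c
#include <assert.h>
#include <string.h>

/* swap(x, y, buf): exchange two m-byte objects through the buffer buf. */
void swap_bytes(void *x, void *y, void *buf, size_t m)
{
    memcpy(buf, x, m);
    memcpy(x, y, m);
    memcpy(y, buf, m);
}
```

With the dedicated axiom, a verifier can conclude the exchange of values in one step instead of applying the general function call rule three times.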


  11. Why new axiomatic rules: example 2
    Given that M is a two-dimensional matrix and e(k, i) is an expression
    depending on matrix indices k and i, consider the following triple:
      {Q(M ← rep(M, mat(e_1, e_2, e_3, e_4), e(s, t)))}
      for(k = e_1; k <= e_2; k++)
        for(i = e_3; i <= e_4; i++)
          M[k][i] = e(k, i);
      {Q}
    where the matrix rep(M, mat(e_1, e_2, e_3, e_4), e(s, t)) results from the
    replacement of all elements corresponding to the sub-matrix
    mat(e_1, e_2, e_3, e_4) by the expression e.
    All these logical functions (rep, mat, etc.) are defined by a set of
    axioms. For example,
      rep(rep(M, S_1, e(s, t)), S_2, e(s, t)) = rep(M, S_1 ∪ S_2, e(s, t))
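
To make the meaning of rep concrete, the nested loop can be compared against an element-wise reading of rep. The sketch assumes a fixed 4x4 integer matrix and a sample expression `e(k, i) = 10 * k + i`; the names `fill_submatrix` and `rep_at` are hypothetical and only serve the illustration.

```c
#include <assert.h>

enum { ROWS = 4, COLS = 4 };

/* Hypothetical element expression e(k, i). */
static int e(int k, int i)
{
    return 10 * k + i;
}

/* The loop nest from the triple: fills rows e1..e2, columns e3..e4. */
void fill_submatrix(int M[ROWS][COLS], int e1, int e2, int e3, int e4)
{
    for (int k = e1; k <= e2; k++)
        for (int i = e3; i <= e4; i++)
            M[k][i] = e(k, i);
}

/* Element (s, t) of rep(M0, mat(e1, e2, e3, e4), e(s, t)):
   e(s, t) inside the sub-matrix, the old element of M0 outside it. */
int rep_at(int M0[ROWS][COLS], int e1, int e2, int e3, int e4, int s, int t)
{
    int inside = e1 <= s && s <= e2 && e3 <= t && t <= e4;
    return inside ? e(s, t) : M0[s][t];
}
```

After `fill_submatrix` runs on a copy of `M0`, every element of the result should agree with `rep_at` on `M0`, which is the content of the triple without any loop invariant.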


  12. Why new axiomatic rules: example 3
    From those methods of loop invariant elimination we can step to
    1. program schemata. For example, Dijkstra's linear search
    scheme
      {P} d = d_0; while(prop(d)) d = f(d) {Q}
    where d, d_0, prop, f are uninterpreted objects and
      Q : ¬prop(d_k) ∧ ∀i (0 ≤ i < k ⇒ prop(d_i)) ∧ d = d_k and
      d_i = f(d_{i−1})
    2. and even further to program transformations
      {P} A {P}, {P} B {Q} ⊢ {P} A; B {Q}
      inv ≡ P, {P} if(e) A {P} ⊢ {P} {inv} while(e) A {P}
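
The linear search scheme becomes an ordinary program once its uninterpreted objects are given an interpretation. The instance below is an assumption chosen for illustration: `d` ranges over `int`, `f` is the successor function, and `prop(d)` holds while `d * d` is below a hypothetical global `bound`.

```c
#include <assert.h>

/* Hypothetical interpretation of the uninterpreted objects. */
static int bound;

static int prop(int d)
{
    return d * d < bound;
}

static int f(int d)
{
    return d + 1;
}

/* The scheme itself: d = d0; while (prop(d)) d = f(d); */
int linear_search(int d0)
{
    int d = d0;
    while (prop(d))
        d = f(d);
    return d;
}
```

On termination `prop` fails at the returned element and holds at every element visited before it, which is the shape of the postcondition Q in the scheme.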


  13. MetaVCG
    The examples above are not artificial. The corresponding
    studies are being conducted in our Lab:
    Ilya Maryasov develops the Mixed-semantics approach;
    Prof. Valery Nepomnyaschy develops the approach of Finite
    iterations over data structures;
    Prof. Nikolay Shilov tries to apply program schemata to the
    verification of dynamic programming algorithms.
    Two possible ways:
    One huge VCG replenished with rules every time we move to a
    new domain: hardly a good idea.
    A collection of specialized VCGs: much better.
    Finally, the error-prone process of manually implementing a
    VCG can compromise the verification.
    A possible solution: the MetaVCG approach.


  14. MetaVCG: origins
    Can the correctness of a VCG be guaranteed not only by
    testing/verification but also by its construction?
    Building on classical results by E.W. Dijkstra, R.L. London,
    D.C. Luckham and others, M. Moriconi and R. Schwartz proposed in
    1981 a method for mechanically constructing VCGs from a
    useful class of Hoare logics.
    Any VCG constructed by the method is shown to be sound
    and deduction-complete w.r.t. the associated Hoare logic.


  15. MetaVCG: scheme
    [Diagram. Boxes: Annotated C program; Analysis and transformation;
    Program in the internal form; MetaVCG; Recursively defined VCG;
    Hoare system; Verification conditions; Axioms; Proof environment.]


  16. MetaVCG: preliminaries
    Metavariables P, Q, R denote partially interpreted first-order
    formulas (P, P ∧ x = 5, or x = 5).
    For a Hoare triple of the form
      {P(P_1, ..., P_m)} S {Q(Q_1, ..., Q_n)}
    where predicate symbols P_i and Q_j are logically free in P and
    Q, respectively, we have
      P_i ⇐ Q_j, for i ∈ {1, ..., m} and j ∈ {1, ..., n}.
    Given H ⇐⁺ T, H will be called the head of the dependency
    chain and T the tail.
    For a rule of the form
      {P_1} S_1 {Q_1}, ..., {P_n} S_n {Q_n}, Γ
      ----------------------------------------
      {P} S {Q}
    we have a dependency between S and each S_i, for i ∈ {1, ..., n}
    (and its transitive closure, marked +).


  17. MetaVCG: preliminaries
    Function FreePreds denotes the set of logically free predicate
    symbols in a formula, a Hoare triple or a rule.
    Function FragVars denotes the set of "fragment variables" in
    the language fragment S of a Hoare triple {P} S {Q}.
      FragVars(if B then S_1 else S_2 fi) = {B, S_1, S_2}
    We use FreePreds and FragVars to distinguish those logically
    free variables that are bound in the program fragment when a
    rule is applied from those that must be bound by wp-calculus.


  18. MetaVCG: normal form rule
    A normal form rule is any instance N of
      {P_1} S_1 {Q_1}, ..., {P_n} S_n {Q_n}, Γ
      ----------------------------------------
      {P} S {Q}
    that satisfies the following constraints:
    1. P_1, ..., P_n and Q are predicate symbols free in N.
    2. Γ is a sentence in the underlying theory such that
    FreePreds(Γ) ⊆ FreePreds(N) ∪ FragVars(S).
    3. The fragment variables of each S_i must be bound in S. So
    ⋃_{1≤i≤n} FragVars(S_i) ⊆ FragVars(S).


  19. MetaVCG: normal form rule
    4. Dependency ordering. The Hoare-triple premises of N must
    satisfy two dependency constraints.
    a. P_i ⇐⁺ P_j ⊃ i < j
    b. T ⇐⁺ U ∧ ¬(∃R) U ⇐⁺ R ⊃ U ≡ Q ∨ U bound in N
    5. Monotonicity. Let P[P ← false, P ∈ s] denote P with the
    proper substitution of false for each predicate P in the set s.
    Then
      P[P_1, ..., P_n, Q ← true] ∨
      ∀s ⊆ {P_1, ..., P_n, Q} ¬P[P ← false, P ∈ s]
    This constraint must hold for Γ and for each Q_i.


  20. MetaVCG: transforming proof rules into VCG
    Given any rule of the form
      {P_1} S_1 {Q_1}, ..., {P_n} S_n {Q_n}, Γ
      ----------------------------------------
      {P} S {Q}
    wdp can be defined as follows:
      wdp(S, Q) = P[P_1 ← wdp(S_1, Q_1), ..., P_n ← wdp(S_n, Q_n)] ∧
                  (∀v) Γ[P_1 ← wdp(S_1, Q_1), ..., P_n ← wdp(S_n, Q_n)]
    where [P_1 ← t_1, ..., P_n ← t_n] denotes n proper substitutions carried
    out sequentially in a left-to-right order, and v is the set of all free
    logical variables in Γ.


  21. MetaVCG: transforming proof rules into VCG
    Taking as examples the classical axiom for assignment (without
    side effects)
      {P[x ← e]} x:=e {P}
    and the rule of inference for statement composition
      {P_1} S_1 {R}, {R} S_2 {Q}
      --------------------------
      {P} S_1;S_2 {Q}
    we obtain the following predicate transformers:
      wdp(x:=e, P) = P[x ← e]
      wdp(S_1;S_2, Q) = P[P ← wdp(S_1, R), R ← wdp(S_2, Q)] =
                        wdp(S_1, wdp(S_2, Q))
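
The two predicate transformers can be tried out semantically, without symbolic substitution. The sketch below does not build the formula wdp(S, Q); it only decides whether that formula holds in a given state, using the fact that P[x ← e] holds in a state exactly when P holds after x is overwritten with the value of e. All type and function names are assumptions made for this illustration, and the state is restricted to two integer variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int x, y; } State;          /* hypothetical two-variable state */
typedef int  (*Expr)(State);                 /* right-hand side e */
typedef bool (*Pred)(State);                 /* postcondition Q */
typedef enum { VAR_X, VAR_Y } Var;
typedef struct { Var target; Expr rhs; } Assign;   /* statement target := rhs */

/* State in which P must hold so that wdp(target := rhs, P) holds in st. */
static State assign(State st, Assign a)
{
    int v = a.rhs(st);
    if (a.target == VAR_X)
        st.x = v;
    else
        st.y = v;
    return st;
}

/* Truth value of wdp(S_1; ...; S_n, Q) in st, unfolded as
   wdp(S_1, wdp(S_2; ...; S_n, Q)). */
bool wdp_holds(const Assign *prog, size_t n, Pred q, State st)
{
    if (n == 0)
        return q(st);
    return wdp_holds(prog + 1, n - 1, q, assign(st, prog[0]));
}

/* Example program x := x + 1; y := 2 * x with postcondition y > 0. */
static int  x_plus_one(State s) { return s.x + 1; }
static int  twice_x(State s)    { return 2 * s.x; }
static bool y_positive(State s) { return s.y > 0; }
```

For the example program the symbolic result is wdp(x := x + 1; y := 2 * x, y > 0) = 2 * (x + 1) > 0, so `wdp_holds` should accept a state with x = 0 and reject one with x = -1.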


  22. MetaVCG: general form rule
    The proof rules look rather unusual:
      {P_1} S_1 {Q}, {P_2} S_2 {Q}
      ----------------------------
      {B ⊃ P_1 ∧ ¬B ⊃ P_2} if B then S_1 else S_2 fi {Q}

      {P_1} S {P}, P ∧ ¬B ⊃ Q, P ∧ B ⊃ P_1
      ------------------------------------
      {P} while B inv P do S od {Q}
    Moreover, the order on the premises is required, thus,
    narrowing the class of applicable Hoare systems. By the way,
    the axiomatic semantics of the C-kernel language does not
    satisfy these requirements.
    Perhaps, we could weaken the constraints somehow?


  23. MetaVCG: general form rule
    A general form rule is any instance G of
      I_1, ..., I_n, Γ
      ----------------
      {P} S {Q}
    that satisfies normal form constraints 1-3 and 4b, where:
    1. Each premise I is in one of the following forms
    a. {R} S {Q}
    b. {F} S {Q}
    c. {R ∧ F} S {Q}
    where, in all three cases, R is a metavariable evaluating to a
    single predicate symbol free in G, F is a metavariable
    evaluating to a formula not containing any predicate symbols
    free in G, and Q is a metavariable.


  24. MetaVCG: general form rule
    2. The relation ⇐⁺ is irreflexive with respect to I_1, ..., I_n.
    3. Let r be the set of predicate symbols free in the preconditions
    of I_1, ..., I_n. Then, the following constraint on P must hold:
      P[P ← true, P ∈ r ∪ {Q}] ∧
      ∀s ⊆ r ∪ {Q} ¬P[P ← false, P ∈ s]
    This constraint must hold for Γ and for each Q_i.

      {P ∧ B} S {P}, P ∧ ¬B ⊃ Q
      -------------------------
      {P} while B inv P do S od {Q}

      {P ∧ B} S_1 {Q}, {P ∧ ¬B} S_2 {Q}
      ---------------------------------
      {P} if B then S_1 else S_2 fi {Q}


  25. MetaVCG: transformation to normal form
    First, sort the rule according to the three classes of allowable premises,
    yielding a schema of the form
      {F_1} S_1 {Q_1}, ..., {F_j} S_j {Q_j},
      {B_{j+1}} S_{j+1} {Q_{j+1}}, ..., {B_k} S_k {Q_k},
      {B_{k+1} ∧ F_{k+1}} S_{k+1} {Q_{k+1}}, ..., {B_n ∧ F_n} S_n {Q_n}, Γ
      ---------------------------------------------------------------------
      {P} S {Q}
    We now define two functions:
    1. Duplicates(i) = {m : |B_m| = |B_i|, j + 1 ≤ m ≤ n}, for
    j + 1 ≤ i ≤ n where, for a metavariable B, |B| denotes the
    partially interpreted first-order formula bound to B, and
    2. MkFormula(i) =
      P_i, for j + 1 ≤ i ≤ k
      |F_i| ⊃ P_i, for k + 1 ≤ i ≤ n


  26. MetaVCG: transformation to normal form
    Now rewrite the sorted schema above as
      {P_1} S_1 {Q_1}, ..., {P_n} S_n {Q_n},
      Γ ∧ (|F_1| ⊃ P_1) ∧ ... ∧ (|F_j| ⊃ P_j)
      ---------------------------------------
      {P} S {Q}
    with the subsequent overall proper substitution
      [|B_i| ← ⋀_{k ∈ Duplicates(i)} MkFormula(k)], for j + 1 ≤ i ≤ n
    The last step is to reorder the premises of this rule to satisfy normal
    form constraint 4a.


  27. MetaVCG: correctness
    It may be demonstrated that a VCG constructed by this method
    is sound and deduction-complete with respect to a general form
    axiomatic definition G:
    Theorem: Let G be any general form axiom system augmented
    by the rule of consequence and the axiom {false} S {Q}, and
    let τ denote the transformation from G to the normal form, and
    suppose that T is a complete (perhaps noneffective) proof system
    for the underlying theory. Then
      ⊢_G {P} S {Q} iff ⊢_T P ⊃ wdp_τ(G)(S, Q).
    Note: This has nothing to do with the soundness and completeness of
    the general form axiom system w.r.t. the operational definition of
    the language.


  28. MetaVCG: practice
    The Hoare logic for C-kernel satisfies the general form constraints
    (thanks to the two-level approach).
    The prototype MetaVCG is implemented in C-light and
    displays the following features:
    ineffective in some sense,
    MetaVCG(H, AP) = VCG_H(AP),
    but more appropriate for verification;
    bidirectional;
    partially verified.


  29. MetaVCG: the pattern language
    It would be great to provide axioms and proof rules in classical
    graphical notation:
      {Q(MD) ← upd(MD, loc(val(e, MeM..STD)), cast(e'))} e = e'; {Q}

      {P} S {I}
      (I ∧ cast(val(e, MeM..STD), type(e, MeM, Γ), int) = 0) ⇒ Q
      (I ∧ cast(val(e, MeM..STD), type(e, MeM, Γ), int) ≠ 0) ⇒ P
      ----------------------------------------------------------
      while(e) S
    but significant effort will be required.
    At the moment, a simple textual representation has been
    developed. The idea is that rules are patterns that must be
    matched against annotated programs. The syntax of C-light is
    accompanied by first-order logic, whereas some syntactic sugar
    denotes regexps.


  30. MetaVCG: the pattern language
    {(any_predicate(Q))
    (MD <- upd(MD, loc(val(e, MeM..STD)),
    cast(val(e', MeM..STD),
    type(e', MeM, TP), type(e, MeM, TP))))
    }
    e = simple_expression(e');
    {any_predicate(Q)}
    {P} S {I},
    (I /\ cast(val(e, MeM..STD), type(e, MeM, TP), int) = 0) => Q,
    (I /\ cast(val(e, MeM..STD), type(e, MeM, TP), int) != 0) => P
    |-
    {any_predicate(INV)}
    while(simple_expression(e)) any_code(S)
    {any_predicate(Q)}


  31. MetaVCG: implementation
    MetaVCG(N, tree) {
    1: transform the Clang AST tree into struct program_node;
    2: transform N into collection of struct pattern_node;
    3: if (backward_strategy) goto 4 else goto 7;
    4: // wp-calculus
    5: take program_node, find an appropriate pattern_node
    and apply the corresponding wdp;
    6: exit;
    7: // sp-calculus
    ...
    }


  32. MetaVCG: implementation
    struct pattern_node
    {
    int is_omitted;
    int has_category;
    char* category;
    int has_identifier;
    char identifier[64];
    int has_type;
    char* type;
    int has_value;
    char* value;
    int is_matched;
    int table_length;
    char match_identifiers[2][1000][64];
    int children_count;
    struct pattern_node* children[1000];
    };


  33. MetaVCG: verification
    /*@ requires \valid(pattern) && \valid(code);
    assigns pattern->table_length;
    assigns pattern->match_identifiers[0..1]
    [\old(pattern->table_length)]
    [0..\max(strlen(pattern->identifier), 63)];
    ensures strncmp(pattern->match_identifiers[0][\old(pattern->table_length)],
    pattern->identifier, 63) == 0;
    ensures strncmp(pattern->match_identifiers[1][\old(pattern->table_length)],
    code->identifier, 63) == 0;
    ensures pattern->table_length == \old(pattern->table_length) + 1;
    */
    void add_identifier(struct pattern_node* pattern, struct program_node* code)
    {
    strncpy(pattern->match_identifiers[0][pattern->table_length],
    pattern->identifier, 63);
    strncpy(pattern->match_identifiers[1][pattern->table_length],
    code->identifier, 63);
    pattern->table_length++;
    }
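
The function can be compiled and exercised on its own with cut-down data structures. In the sketch below the structs keep only the fields that `add_identifier` touches, and the table size `MAX_MATCHES` is a reduced, hypothetical value; the bounds assertion and the explicit NUL termination are additions made for the sketch and do not appear on the slide.

```c
#include <assert.h>
#include <string.h>

enum { MAX_MATCHES = 16, ID_LEN = 64 };   /* assumed, reduced sizes */

/* Cut-down versions of the structures, limited to the fields used here. */
struct program_node {
    char identifier[ID_LEN];
};

struct pattern_node {
    char identifier[ID_LEN];
    int table_length;
    char match_identifiers[2][MAX_MATCHES][ID_LEN];
};

/* Records the pair (pattern identifier, matched program identifier)
   in the next free row of the match table. */
void add_identifier(struct pattern_node *pattern,
                    const struct program_node *code)
{
    int n = pattern->table_length;
    assert(n >= 0 && n < MAX_MATCHES);    /* added guard: table not full */
    strncpy(pattern->match_identifiers[0][n], pattern->identifier, ID_LEN - 1);
    strncpy(pattern->match_identifiers[1][n], code->identifier, ID_LEN - 1);
    /* strncpy does not terminate a source of ID_LEN - 1 or more characters. */
    pattern->match_identifiers[0][n][ID_LEN - 1] = '\0';
    pattern->match_identifiers[1][n][ID_LEN - 1] = '\0';
    pattern->table_length = n + 1;
}
```

After one call on an empty table, `table_length` should be 1 and row 0 should hold the pattern identifier and the program identifier, matching the `ensures` clauses of the contract.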


  34. Conclusion
    Results
    the axiomatic semantics of C-kernel can be automatically
    transformed into a recursively defined VCG;
    MetaVCG was implemented using a mixture of C and C++;
    apart from theoretical correctness we were able to partially
    verify our prototype tool.
    Plans
    the correctness theorem should be checked for the strongest
    postcondition approach;
    reducing the C++ part of the implementation.


  35. Conclusion
    Questions?
