
TMPA-2021: Formal Methods: Theory and Practice of Linux Verification Center

Alexey Khoroshilov, ISP RAS

TMPA is an annual International Conference on Software Testing, Machine Learning and Complex Process Analysis. The conference will focus on the application of modern methods of data science to the analysis of software quality.

Exactpro
November 25, 2021

Transcript

  1. Formal Methods:
    Theory and Practice of
    Linux Verification Center
    Alexey Khoroshilov
    [email protected]
    Ivannikov Institute for System Programming
    of the Russian Academy of Sciences
    SOFTWARE TESTING, MACHINE LEARNING AND COMPLEX PROCESS ANALYSIS
    25-27 NOVEMBER 2021

  2. Bug-free software

  4. ● In all possible configurations
    on all possible input data
    with all possible interactions with environments
    with all possible timings/preemptions/...
    ● software behaves correctly
    Formal Methods: A mathematical magic tool

  5. Formal Methods: How it works
    Software
    Software
    Model
    Mathematical world
    Real world

  6. Formal Methods: How it works
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Real world

  7. Formal Methods: How it works
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    satisfies
    Real world

  8. Formal Methods: How it works
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world

  9. Formal Methods: Problem 1
    Software
    Software
    Model
    Mathematical world
    Real world

  10. Formal Methods: Problem 2
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Real world

  11. Formal Methods: Problem 3
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    satisfies
    Real world

  12. Formal Methods: Problem 4
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world

  13. Formal Methods: Challenges
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world
    1. Transition
    2. Complexity

  14. Challenge 1 - Transition
    ● Transition from the informal to the formal is
    essentially informal
    M. R. Shura-Bura

  15. Challenge 1 - Transition
    ● Transition from the informal to the formal is
    essentially informal
    M. R. Shura-Bura
    Requirements
    Requirements
    Model
    ● Software behaves correctly
    ● Requirements for real software
    are complex
    JetOS (DO-178C avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
    Pages (A4)          1048         2620
    Elements            3041         7753
    Requirements        1359         2408
    Definitions          894         1960
    Notes                785         2383
    Sections             471         1620

  16. Challenge 1 - Transition
    JetOS (DO-178C avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
    Pages (A4)          1048         2620
    Elements            3041         7753
    Requirements        1359         2408
    Definitions          894         1960
    Notes                785         2383
    Sections             471         1620
    Source code statistics:
    Components            57
    Functions            892
    Lines            23 KLoC
    For comparison (statistics by https://www.openhub.net,
    accessed 24 Nov 2021):
    Linux Kernel     21,640 KLoC
    OpenJDK          11,641 KLoC
    PostgreSQL        1,123 KLoC
    1 HLR per 17 LoC (23 KLoC / 1359 high-level requirements ≈ 17)

  17. Challenge 1 - Transition
    ● Any sufficiently precise (formal) requirements are
    complex and inevitably contain bugs
    ● Mitigations:
    ● Formalize only simple properties
    ● Easily audited by experts (or even stakeholders)
    ● Avoid explicit requirements altogether; check for the
    absence of typical bugs or of assertion violations in the code
    ● e.g. safety properties: buffer overrun, NULL pointer
    dereference, double free, etc.
    ● Complete specifications
    ● Checked against implementations
    ● Checked for self-consistency by tools
    ● Reviewed by experts

  18. Challenge 1 - Transition
    ● Any sufficiently precise (formal) requirements are complex
    and inevitably contain bugs
    ● Mitigations:
    ● Formalize only simple properties
    ● Easily audited by experts (or even stakeholders)
    ● good confidence, limited scope
    ● Avoid explicit requirements altogether; check for the absence
    of typical bugs or of assertion violations in the code
    ● e.g. safety properties: buffer overrun, NULL pointer
    dereference, double free, etc. (see the sketch after this list)
    ● good confidence, very limited scope
    ● Complete specifications
    ● Checked against implementations
    ● Checked for self-consistency by tools
    ● Reviewed by experts
    ● questionable confidence, large scope
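
    A hedged, minimal illustration (mine, not from the talk) of such a
    "typical bug": a double free on an error path. No requirements
    document is needed to know this is wrong, so a checker can hunt
    for it as a reachability violation:

    #include <stdlib.h>
    #include <string.h>

    int store(const char *src)
    {
        char *buf = malloc(16);
        if (buf == NULL)
            return -1;
        if (strlen(src) >= 16) {
            free(buf);
            goto fail;
        }
        strcpy(buf, src);
        free(buf);
        return 0;
    fail:
        free(buf);  /* BUG: second free of buf on the error path */
        return -1;
    }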

  19. Challenge 2 - Complexity

  20. Challenge 2 - Complexity
    Linux kernel/_do_syscall_accept call graph

  21. Challenge 2 - Complexity

  22. Challenge 2 - Complexity

  23. Challenge 2 - Complexity
    ● The main mitigation is abstraction (a tiny sketch follows below)
    ● Simplify and ignore irrelevant details
    ● Focus on and generalize important central
    properties and characteristics
    ● Avoid premature design and implementation
    choices
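
    As a hedged sketch (names assumed, not from the talk): abstraction
    often shows up as replacing irrelevant detail with nondeterminism.
    Here a hardware register read is modelled as "some int comes back",
    using the SV-COMP convention __VERIFIER_nondet_int:

    extern int __VERIFIER_nondet_int(void);

    /* The real driver code reads a device register; the model keeps
       only what matters for the property: an arbitrary status value. */
    static int hw_read_status(void)
    {
        return __VERIFIER_nondet_int();
    }

    The verifier then explores all possible status values, so a proof
    covers every behaviour of the real hardware (at the price of
    possible false alarms).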

  24. Complexity Challenge – Pattern 1
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world
    - Translation
    - Abstraction
    - Abstraction
    - Deductive Verification
    - Model Checking

  25. Software model built by expert
    ● Model checking
    ● fully automated proof
    ● any incremental change in the software or requirements
    can exceed the capacity limits of the tools
    ● with no good fallback
    ● Deductive verification
    ● manual decomposition
    ● automated discharge of generated verification
    conditions
    ● fall back to interactive theorem proving if a VC is too
    complex
    Complexity Challenge – Pattern 1

  26. Software model built by expert
    ● Model checking
    ● fully automated proof
    ● any incremental change in the software or requirements can
    exceed the capacity limits of the tools
    ● with no good fallback
    ● good confidence, moderate efforts, limited size
    ● Deductive verification
    ● manual decomposition
    ● automated discharge of generated verification conditions
    ● fall back to interactive theorem proving if a VC is too complex
    ● good confidence, big efforts, big scope (limited by cost)
    Complexity Challenge – Pattern 1

  27. Complexity Challenge – Pattern 2
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world
    - Translation - Abstraction
    - Software Deductive Verification
    - abstraction at decomposition level

  28. Software model built by tools (white box),
    abstraction done by expert at decomposition
    level
    ● Software deductive verification
    ● manual abstraction at decomposition level
    ● e.g. pre/post conditions, loop invariants, ...
    ● automated discharge of generated verification
    conditions
    ● fall back to interactive theorem proving if VC is
    too complex
    Complexity Challenge – Pattern 2

  29. Software model built by tools (white box),
    abstraction done by expert at decomposition level
    ● Software deductive verification
    ● manual abstraction at decomposition level
    ● e.g. pre/post conditions, loop invariants, ... (a sketch follows below)
    ● automated discharge of generated verification
    conditions
    ● fall back to interactive theorem proving if VC is too
    complex
    ● good confidence, big efforts, big scope (limited by
    cost)
    Complexity Challenge – Pattern 2
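
    A minimal sketch of what "abstraction at decomposition level" can
    look like in ACSL, the contract language of Frama-C (one tool in
    this space; the example is mine, not from the talk). The contract
    and loop invariant are the expert-written abstraction; the
    generated verification conditions are then discharged by provers:

    /*@ requires n > 0;
        requires \valid_read(a + (0 .. n-1));
        assigns \nothing;
        ensures \forall integer k; 0 <= k < n ==> \result >= a[k];
        ensures \exists integer k; 0 <= k < n && \result == a[k];
    */
    int max(const int *a, int n)
    {
        int m = a[0];
        /*@ loop invariant 1 <= i <= n;
            loop invariant \forall integer k; 0 <= k < i ==> m >= a[k];
            loop invariant \exists integer k; 0 <= k < i && m == a[k];
            loop assigns i, m;
            loop variant n - i;
        */
        for (int i = 1; i < n; i++)
            if (a[i] > m)
                m = a[i];
        return m;
    }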

  30. Complexity Challenge – Pattern 3
    Software Requirements
    Software
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world
    - Translation - Abstraction
    - Software Model Checking
    - abstraction during building proof

  31. Software model built by tools (black box),
    abstraction done by the tools during building
    the proof
    ● Software model checking
    ● fully automated proof
    ● any incremental change in the software or requirements
    can exceed the capacity limits of the tools
    ● with no good fallback
    Complexity Challenge – Pattern 3

  32. Software model built by tools (black box),
    abstraction done by the tools during building
    the proof
    ● Software model checking
    ● fully automated proof
    ● any incremental change in the software or requirements
    can exceed the capacity limits of the tools
    ● with no good fallback
    ● good confidence, moderate efforts, limited
    complexity of code and requirements
    ● less confidence, better scope (a toy input is sketched below)
    Complexity Challenge – Pattern 3
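
    A toy example (mine, not from the talk) of what a software model
    checker consumes: plain C plus a reachability property expressed
    as an assertion, with inputs left nondeterministic. A tool such as
    CPAchecker explores all values of x and reports the assertion
    failure at x == INT_MAX, where x + 1 overflows:

    #include <assert.h>
    extern int __VERIFIER_nondet_int(void);

    int main(void)
    {
        int x = __VERIFIER_nondet_int();
        if (x > 0) {
            int y = x + 1;       /* overflows when x == INT_MAX */
            assert(y > x);       /* the property being checked  */
        }
        return 0;
    }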

  33. Complexity Challenge – Pattern 4
    Software Requirements
    Execution
    Trace
    Model
    Requirements
    Model
    Mathematical world
    Proof
    satisfies
    Real world
    - Translation - Abstraction
    - Trace Checkers
    Execution
    Trace

    ● No software model is built at all: tests are executed,
    events are collected, and the resulting trace is checked
    against the requirements model
    ● Run Time Verification
    ● traces are checked against the model automatically
    ● much easier
    ● tests prepared manually or generated from the
    requirements model
    Complexity Challenge – Pattern 4

    ● No software model is built at all: tests are executed,
    events are collected, and the resulting trace is checked
    against the requirements model (a toy checker is sketched below)
    ● Run Time Verification
    ● traces are checked against the model automatically
    ● much easier
    ● tests prepared manually or generated from the
    requirements model
    ● sacrificed confidence, moderate efforts,
    almost unlimited complexity
    Complexity Challenge – Pattern 4
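
    A toy trace checker (mine, not from the talk) for the simple
    requirements-model property "read only after a successful open".
    Real runtime-verification tools generate such monitors from the
    requirements model instead of hand-writing them:

    typedef enum { EV_OPEN_OK, EV_OPEN_FAIL, EV_READ, EV_CLOSE } event_t;

    /* Returns the index of the first violating event,
       or -1 if the trace satisfies the property. */
    int check_trace(const event_t *trace, int n)
    {
        int opened = 0;
        for (int i = 0; i < n; i++) {
            switch (trace[i]) {
            case EV_OPEN_OK:   opened = 1; break;
            case EV_OPEN_FAIL: break;
            case EV_READ:      if (!opened) return i; break;
            case EV_CLOSE:     opened = 0; break;
            }
        }
        return -1;
    }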

  36. Challenge 2 - Complexity
    ● Deductive Verification
    ● Models Deductive Verification
    ● Software Deductive Verification
    ● good confidence, big efforts, big scope
    (limited by cost)
    ● Model Checking
    ● Traditional Model Checking
    ● Software Model Checking
    ● good confidence, moderate efforts, limited
    complexity of code and requirements
    ● Run Time Verification
    ● sacrificed confidence, moderate efforts,
    almost unlimited complexity

  37. founded in 2005
    ● User Space Model Based Testing
    ● Application Binary/Program Interface Stability
    ● Linux Driver Verification Program
    ● Linux File System Verification
    ● Deductive Verification of Operating Systems
    ● Model Based Access Control Testing
    Linux Verification Center

  38. founded in 2005
    ● User Space Model Based Testing – Pattern 4
    ● Application Binary/Program Interface Stability
    ● Linux Driver Verification Program
    ● Linux File System Verification
    ● Deductive Verification of Operating Systems
    ● Model Based Access Control Testing
    Linux Verification Center

  39. Open Linux VERification (OLVER)
    Linux Standard Base 3.1:
    LSB Core 3.1 / ISO 23360: LSB Core ABI
    (GLIBC: libc, libcrypt, libdl, libpam, libz, libncurses,
    libm, libpthread, librt, libutil),
    ABI Utilities (ELF, RPM, …),
    LSB C++, LSB Desktop

  40. ● Requirements catalogue built for LSB and POSIX
    ● 1532 interfaces
    ● 22663 elementary requirements
    ● 97 deficiencies in specification reported
    ● Formal specifications and tests developed for
    ● 1270 interfaces (good quality)
    ● + 260 interfaces (basic quality)
    ● 80+ bugs reported in modern distributions
    ● OLVER is a part of the official LSB Certification test suite
    http://ispras.linuxfoundation.org
    OLVER Results

  41. Test Suite Architecture
    Components (from the diagram): specification, test coverage
    tracker, test oracle, data model, mediators, test scenario,
    scenario driver, test engine, system under test
    Legend: automatic derivation / pre-built / manual / generated
    (A toy illustration of the test-oracle idea follows below.)
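
    A toy illustration in plain C (the actual OLVER suite uses the
    UniTESK technology and its own specification notation) of the
    test-oracle idea: the data model predicts the result, and the
    oracle checks the system under test against it:

    #include <assert.h>
    #include <string.h>

    /* requirements-model view of strlen */
    static size_t strlen_model(const char *s)
    {
        size_t n = 0;
        while (s[n] != '\0')
            n++;
        return n;
    }

    /* oracle: postcondition check against the real implementation */
    static void oracle_strlen(const char *s)
    {
        assert(strlen(s) == strlen_model(s));
    }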

    ● model-based testing makes it possible to achieve better
    quality with fewer resources
    ● maintenance of MBT suites is cheaper
    OLVER Conclusions

    ● model-based testing makes it possible to achieve better
    quality with fewer resources
    if you have smart test engineers
    ● maintenance of MBT suites is cheaper
    if you have smart test engineers
    OLVER Conclusions

  44. founded in 2005
    ● User Space Model Based Testing
    ● Application Binary/Program Interface Stability
    ● Linux Driver Verification Program – Pattern 3
    ● Linux File System Verification
    ● Deductive Verification of Operating Systems
    ● Model Based Access Control Testing
    Linux Verification Center

  45. Commit Analysis(*)
    ● All patches in stable trees (2.6.35 – 3.0) for 1
    year:
    ● 26 Oct 2010 – 26 Oct 2011
    ● 3101 patches overall
    (*) Khoroshilov A.V., Mutilin V.S., Novikov E.M. Analysis of typical faults in Linux operating system drivers.
    Proceedings of the Institute for System Programming of RAS, volume 22,
    2012, pp. 349-374. (In Russian)
    http://ispras.ru/ru/proceedings/docs/2012/22/isp_22_2012_349.pdf
    Raw data: http://linuxtesting.org/downloads/ldv-commits-analysis-2012.zip

  46. Taxonomy of Typical Bugs
    (columns: type, number of bug fixes, percent within the class,
    cumulative percent)
    Correct usage of the Linux kernel API (176 fixes, ~50%):
      Alloc/free resources                      32   ~18%   ~18%
      Check parameters                          25   ~14%   ~32%
      Work in atomic context                    19   ~11%   ~43%
      Uninitialized resources                   17   ~10%   ~53%
      Synchronization primitives in one thread  12    ~7%   ~60%
      Style                                     10    ~6%   ~65%
      Network subsystem                         10    ~6%   ~71%
      USB subsystem                              9    ~5%   ~76%
      Check return values                        7    ~4%   ~80%
      DMA subsystem                              4    ~2%   ~82%
      Core driver model                          4    ~2%   ~85%
      Miscellaneous                             27   ~15%   100%
    Generic (102 fixes, ~30%):
      NULL pointer dereferences                 31   ~30%   ~30%
      Alloc/free memory                         24   ~24%   ~54%
      Syntax                                    14   ~14%   ~68%
      Integer overflows                          8    ~8%   ~76%
      Buffer overflows                           8    ~8%   ~83%
      Uninitialized memory                       6    ~6%   ~89%
      Miscellaneous                             11   ~11%   100%
    Synchronization (71 fixes, ~20%):
      Races                                     60   ~85%   ~85%
      Deadlocks                                 11   ~15%   100%
    (the diagram also labels the covering analyses:
    Reachability, CPALockator, SMG)

  47. CPAchecker
    ● Modular framework for software verification
    ● Written in Java
    ● Open source: Apache 2.0 License
    ● Over 40 contributors so far
    from ~8 universities/institutions
    ● ~300,000 lines of code
    (170,000 without blanks and comments)
    ● Started in 2007
    http://cpachecker.sosy-lab.org

  48. Linux Driver Verification
    http://linuxtesting.org/ldv

  49. ● The main strategy
    ● along the natural boundary of loadable kernel modules
    Partitioning

  50. int main(int argc, char *argv[])
    {
        ...
        other_func(var);
        ...
    }

    void other_func(int *v)
    {
        ...
        assert(v != NULL);    /* the property the tool checks */
    }
    Verification Tools World

  51. Device Driver World
    int usbpn_open(struct net_device *dev) { ... }
    int usbpn_close(struct net_device *dev) { ... }
    struct net_device_ops usbpn_ops = {
        .ndo_open = usbpn_open,
        .ndo_stop = usbpn_close,
    };
    int usbpn_probe(struct usb_interface *intf,
                    const struct usb_device_id *id)
    {
        dev->netdev_ops = &usbpn_ops;
        err = register_netdev(dev);
    }
    void usbpn_disconnect(struct usb_interface *intf) { ... }
    struct usb_driver usbpn_struct = {
        .probe = usbpn_probe,
        .disconnect = usbpn_disconnect,
    };
    int __init usbpn_init(void) { return usb_register(&usbpn_struct); }
    void __exit usbpn_exit(void) { usb_deregister(&usbpn_struct); }
    module_init(usbpn_init);
    module_exit(usbpn_exit);
    No explicit calls to init/exit procedures

  52. Device Driver World
    int usbpn_open(struct net_device *dev) { ... }
    int usbpn_close(struct net_device *dev) { ... }
    struct net_device_ops usbpn_ops = {
        .ndo_open = usbpn_open,
        .ndo_stop = usbpn_close,
    };
    int usbpn_probe(struct usb_interface *intf,
                    const struct usb_device_id *id)
    {
        dev->netdev_ops = &usbpn_ops;
        err = register_netdev(dev);
    }
    void usbpn_disconnect(struct usb_interface *intf) { ... }
    struct usb_driver usbpn_struct = {
        .probe = usbpn_probe,
        .disconnect = usbpn_disconnect,
    };
    int __init usbpn_init(void) { return usb_register(&usbpn_struct); }
    void __exit usbpn_exit(void) { usb_deregister(&usbpn_struct); }
    module_init(usbpn_init);
    module_exit(usbpn_exit);
    Callback interface procedures registration
    No explicit calls to init/exit procedures

  54. Active Driver Environment Model (1)
    int main(int argc, char *argv[])
    {
        usbpn_init();
        for (;;) {
            switch (*) {               /* '*' denotes a nondeterministic choice */
            case 0: usbpn_probe(*, *); break;
            case 1: usbpn_open(*); break;
            ...
            }
        }
        usbpn_exit();
    }

  55. ● Order limitation
    ● open() after probe(), but before remove()
    ● Implicit limitations
    ● read() only if open() succeeded
    ● and these are specific to each class of drivers
    (one way to encode such constraints is sketched below)
    Active Driver Environment Model (2)
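
    A hedged sketch (simplified; the real LDV environment models are
    generated by tools) of encoding the ordering constraint into the
    environment model of slide 54, using a state variable as a guard.
    Driver identifiers are those of the earlier slides; '*' again
    denotes a nondeterministic choice or value:

    int main(int argc, char *argv[])
    {
        int probed = 0;
        usbpn_init();
        for (;;) {
            switch (*) {
            case 0: if (!probed && usbpn_probe(*, *) == 0)
                        probed = 1;                          break;
            case 1: if (probed) usbpn_open(*);               break; /* only after probe */
            case 2: if (probed) { usbpn_disconnect(*);
                                  probed = 0; }              break;
            ...
            }
        }
        usbpn_exit();
    }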

  56. ● Precise
    ● Complete - to avoid missing bugs
    ● Correct - to avoid false alarms
    Active Driver Environment Model (3)

  57. Active Driver Environment Model (3)
    ● Precise
    ● Complete - to avoid missing bugs
    ● Correct - to avoid false alarms
    ● Simple enough

  58. Bugs Found http://linuxtesting.org/results/ldv
    >420 patches already applied

  59. founded in 2005
    ● User Space Model Based Testing
    ● Application Binary/Program Interface Stability
    ● Linux Driver Verification Program
    ● Linux File System Verification
    ● Deductive Verification of Operating Systems – Patterns 1 & 2
    ● Model Based Access Control Testing
    Linux Verification Center

  60. Deductive Verification of Models – Pattern 1
    ● MLS+MIC+RBAC access control model (proprietary LSM, AstraLinux)
      ● 4500 LoC
      ● 60 vars, 75 events, 248 invariants
      ● 2962 proof obligations
    ● MLS+MIC access control model (SELinux-based LSM, BaseAlt Linux)
      ● 1500 LoC
      ● 25 vars, 35 events, 56 invariants
      ● 791 proof obligations
    (a flavour of such an invariant is sketched below)
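
    As a hedged illustration only (not quoted from the actual
    AstraLinux or BaseAlt models): a typical invariant of an MLS
    access control model is the Bell-LaPadula "no read up" property,

        \forall s \in Subjects, o \in Objects:
            read \in access(s, o) \Rightarrow label(o) \sqsubseteq label(s)

    and every event of the model must be proven to preserve every
    such invariant, which is where the proof obligation counts above
    come from.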

  61. Deductive Verification of Code – Pattern 2
    ● Proprietary LSM (AstraLinux)
      ● 10 KLoC
      ● sequential properties
      ● assumption of correctness of library functions
    ● Collaboration with developers
      ● special tools to merge specifications into the code
      ● specifications are in the source code repository now
      ● continuous verification: GitLab CI tasks reverify each commit
    ● VerKer: 25 unmodified Linux kernel library functions, open source
      https://forge.ispras.ru/projects/verker

  62. Linux Kernel Specifics
    ● Low level memory operations
    ● Arithmetic with pointers to fields of structures
    (container_of; see the sketch after this list)
    ● Prefix structure casts
    ● Reinterpret casts
    ● Integer overflows
    ● Concurrency
    ● Other
    ● Function pointers
    ● String literals
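
    For readers unfamiliar with it: container_of is the kernel idiom
    (shown here simplified, with a made-up pair of structs) that
    recovers the enclosing structure from a pointer to one of its
    fields. Verification tools must model this pointer arithmetic
    soundly:

    #include <stddef.h>

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    struct inner { int x; };
    struct outer { int flags; struct inner in; };

    /* given a struct inner *p that lives inside a struct outer: */
    /* struct outer *o = container_of(p, struct outer, in);      */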

  63. (image-only slide)

  64. Several 3rd Party Industrial Applications

  65. Industry Applications – Pattern 2
    ● UK Air Traffic Management System
    ● 250 K logical lines of code (in Ada)
    ● type safety proved; functional correctness for a small
    part of the code
    ● 153K VCs, of which 98.76% are proven
    automatically
    (*) Angela Wallenburg, “Safe and Secure Programming Using SPARK”

  66. OS Deductive Verification – Pattern 2
    ● Verisoft (2004-2008, Isabelle): code designed for verification;
      scope: hw/kernel/compiler/libraries/apps; size: 10 kLOC (kernel)
    ● L4.verified seL4 (2004-2009, Isabelle): code designed for
      verification, performance oriented; scope: microkernel security
      model (no MMU); size: 7.5 kLOC (without asm and boot)
    ● Verisoft-XT small-hv (2007-2013, VCC): code designed for
      verification; scope: separation property only; size: 2.5 kLOC
    ● Verisoft-XT Hyper-V (2007-2013, VCC): industrial code;
      scope: separation property only; size: 100 kLOC
    ● Verisoft-XT PikeOS (2007-2013, VCC): industrial code, simplicity
      for performance; scope: some system calls; size: 10 kLOC

  67. OS Deductive Verification – Pattern 2
    (the same table as on the previous slide)
    2:1 overhead for specifications and 10:1 overhead for the proofs

  68. Industry Applications – Software Model Checking – Pattern 3

  69. Formal Methods: Conclusions

  70. ● A mathematical magic tool proving that
    in all possible configurations
    on all possible input data
    with all possible interactions with environments
    with all possible timings/preemptions/...
    ● software behaves correctly
    ● Trade-off between
    ● confidence
    ● feasibility
    ● cost
    Formal Methods: Conclusions

  71. Formal methods
    ● can be applied at various levels of the resource
    investment / confidence trade-off
    ● can guarantee the absence of certain defects
    under certain assumptions
    ● provide a valuable way to increase
    confidence in system reliability
    Conclusions (1)

  72. ● Proving absence of typical bugs/safety properties
    ● limited efforts, variable confidence, limited size of code
    ● Full-fledged formal specifications of requirements
    ● encourage an abstract view of the system
    ● often more valuable than verification itself
    ● deductive verification
    ● expensive, good confidence, limited scope
    ● run time verification
    ● moderate cost, limited confidence, unlimited size of code
    Conclusions (2)

  73. ● Combinations of techniques that complement each other
    ● Deductive verification
    ● Software model checking
    ● Run time verification
    ● Aspect-based requirements models
    ● Abstraction hints for tools
    ● Better automation and human-machine interface
    Future Directions

  74. Ivannikov Institute for System Programming of the Russian Academy of Sciences
    Thank you!
    http://linuxtesting.org
    http://ispras.ru

  75. Ivannikov Institute for System Programming of the Russian Academy of Sciences
    Thank you!
    Alexey Khoroshilov
    [email protected]
    http://linuxtesting.org/
    Morris Kline. “Mathematics: The Loss of Certainty”. Oxford University Press, 1980
