
TMPA-2021: Formal Methods: Theory and Practice of Linux Verification Center

Alexey Khoroshilov, ISP RAS
Exactpro
November 25, 2021

TMPA is an annual International Conference on Software Testing, Machine Learning and Complex Process Analysis. The conference will focus on the application of modern methods of data science to the analysis of software quality.

To learn more about Exactpro, visit our website https://exactpro.com/

Follow us on
LinkedIn https://www.linkedin.com/company/exactpro-systems-llc
Twitter https://twitter.com/exactpro


Transcript

  1. Formal Methods: Theory and Practice of Linux Verification Center
     Alexey Khoroshilov [email protected]
     Ivannikov Institute for System Programming of the Russian Academy of Sciences
     SOFTWARE TESTING, MACHINE LEARNING AND COMPLEX PROCESS ANALYSIS, 25-27 NOVEMBER 2021
  2. Formal Methods: A mathematical magic tool
     • in all possible configurations, on all possible input data, with all possible interactions with environments, with all possible timings/preemptions/...
     • software behaves correctly
  3. Formal Methods: How it works
     [Diagram: in the real world, Software satisfies Requirements; in the mathematical world, a Proof shows that the Software Model satisfies the Requirements Model]
  4. Formal Methods: Challenges
     [Same diagram, with two challenges marked: 1. Transition (from real-world Software and Requirements to their models), 2. Complexity (of the proof)]
  5. Challenge 1 - Transition
     • "Transition from the informal to the formal is essentially informal" (M.R. Shura-Bura)
  6. Challenge 1 - Transition
     • "Transition from the informal to the formal is essentially informal" (M.R. Shura-Bura)
     [Diagram fragment: Requirements → Requirements Model]
     • Software behaves correctly
     • Requirements for real software are complex

     JetOS (DO-178C Avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
     Pages (A4)     1048         2620
     Elements       3041         7753
     Requirements   1359         2408
     Definitions    894          1960
     Notes          785          2383
     Sections       471          1620
  7. Challenge 1 - Transition
     JetOS (DO-178C Avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
     Pages (A4)     1048         2620
     Elements       3041         7753
     Requirements   1359         2408
     Definitions    894          1960
     Notes          785          2383
     Sections       471          1620

     Source Code Statistics:
     Components     57
     Functions      892
     Lines          23 KLoC     (1 HLR per 17 LoC)

     For comparison (statistics by https://www.openhub.net, accessed 24 Nov 2021):
     Linux Kernel   21,640 KLoC
     OpenJDK        11,641 KLoC
     PostgreSQL     1,123 KLoC
  8. Challenge 1 - Transition
     • Any precise enough (formal) requirements are complex and definitely contain bugs
     • Mitigations:
       • Formalize only simple properties
         • easily audited by experts (or even stakeholders)
       • Avoid requirements at all; check for the absence of typical bugs or asserts in the code
         • e.g. safety properties: buffer overrun, NULL pointer dereference, double free, etc.
       • Complete specifications
         • checked against implementations
         • checked for self-consistency by tools
         • reviewed by experts
  9. Challenge 1 - Transition
     • Any precise enough (formal) requirements are complex and definitely contain bugs
     • Mitigations:
       • Formalize only simple properties
         • easily audited by experts (or even stakeholders)
         • good confidence, limited scope
       • Avoid requirements at all; check for the absence of typical bugs or asserts in the code
         • e.g. safety properties: buffer overrun, NULL pointer dereference, double free, etc. (illustrated below)
         • good confidence, very limited scope
       • Complete specifications
         • checked against implementations
         • checked for self-consistency by tools
         • reviewed by experts
         • questionable confidence, large scope
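     The safety properties named above correspond to concrete code patterns. A minimal, hypothetical C sketch (not from the talk) containing all three defects such checkers target:

     #include <stdlib.h>
     #include <string.h>

     /* Hypothetical examples of the safety properties named above. */
     void typical_bugs(const char *input) {
         char buf[8];
         strcpy(buf, input);          /* buffer overrun if input is longer than 7 chars */

         char *p = malloc(16);
         p[0] = 'x';                  /* NULL pointer dereference if malloc failed */

         free(p);
         free(p);                     /* double free */
     }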
  10. Challenge 2 - Complexity
     • The main mitigation is abstraction
       • Simplify and ignore irrelevant details
       • Focus on and generalize important central properties and characteristics
       • Avoid premature design and implementation choices
  11. Complexity Challenge – Pattern 1
     [Diagram: Software and Requirements are translated and abstracted by hand into a Software Model and a Requirements Model; the proof is done by Deductive Verification or Model Checking]
  12. Complexity Challenge – Pattern 1
     Software model built by an expert
     • Model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
     • Deductive verification
       • manual decomposition
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
  13. Complexity Challenge – Pattern 1
     Software model built by an expert
     • Model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
       • good confidence, moderate efforts, limited size
     • Deductive verification
       • manual decomposition
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
       • good confidence, big efforts, big scope (limited by cost)
  14. Complexity Challenge – Pattern 2
     [Diagram: Requirements are translated and abstracted into a Requirements Model; the Software itself is handled by Software Deductive Verification, with abstraction done at the decomposition level]
  15. Complexity Challenge – Pattern 2
     Software model built by tools (white box), abstraction done by an expert at the decomposition level
     • Software deductive verification
       • manual abstraction at the decomposition level
         • e.g. pre/post conditions, loop invariants, ...
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
  16. Complexity Challenge – Pattern 2
     Software model built by tools (white box), abstraction done by an expert at the decomposition level
     • Software deductive verification
       • manual abstraction at the decomposition level
         • e.g. pre/post conditions, loop invariants, ... (see the sketch below)
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
       • good confidence, big efforts, big scope (limited by cost)
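     As a concrete illustration of abstraction at the decomposition level, a minimal sketch in the ACSL annotation language used by Frama-C-style tools (the function is hypothetical, not from the talk): the contract and the loop invariant are the manual abstraction; the verification conditions they generate are discharged automatically.

     #include <stddef.h>

     /*@ requires n > 0 && \valid_read(a + (0 .. n-1));
       @ assigns \nothing;
       @ ensures \forall integer k; 0 <= k < n ==> \result >= a[k];
       @*/
     int max_elem(const int *a, size_t n) {
         int m = a[0];
         size_t i = 1;
         /*@ loop invariant 1 <= i <= n;
           @ loop invariant \forall integer j; 0 <= j < i ==> m >= a[j];
           @ loop assigns i, m;
           @ loop variant n - i;
           @*/
         while (i < n) {
             if (a[i] > m)
                 m = a[i];        /* keep the running maximum */
             i++;
         }
         return m;
     }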
  17. Complexity Challenge – Pattern 3
     [Diagram: Requirements are translated and abstracted into a Requirements Model; the Software is verified directly by Software Model Checking, with abstraction done by the tool while building the proof]
  18. Complexity Challenge – Pattern 3
     Software model built by tools (black box), abstraction done by the tools while building the proof
     • Software model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
  19. Complexity Challenge – Pattern 3
     Software model built by tools (black box), abstraction done by the tools while building the proof
     • Software model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
       • good confidence, moderate efforts, limited complexity of code and requirements
       • less confidence, better scope
  20. Complexity Challenge – Pattern 4
     [Diagram: Requirements are translated and abstracted into a Requirements Model; instead of a software model, an Execution Trace is collected from the running Software and checked against the Requirements Model by Trace Checkers]
  21. Complexity Challenge – Pattern 4
     • A software model is not built at all: tests are executed, events are collected, and the resulting traces are checked against the requirements model
     • Run Time Verification
       • traces are checked against the model automatically
         • much easier
       • tests prepared manually or generated from the requirements model
  22. Complexity Challenge – Pattern 4
     • A software model is not built at all: tests are executed, events are collected, and the resulting traces are checked against the requirements model
     • Run Time Verification
       • traces are checked against the model automatically
         • much easier
       • tests prepared manually or generated from the requirements model
       • sacrificed confidence, moderate efforts, almost unlimited complexity
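     A minimal sketch of the trace-checking idea (hypothetical event alphabet and property, not the actual ISPRAS tooling): the requirements model is reduced to a small state machine over events, and every recorded event is checked against it.

     #include <stdio.h>

     /* Hypothetical event alphabet and property: "read may occur only
        between a successful open and the matching close". */
     typedef enum { EV_OPEN_OK, EV_OPEN_FAIL, EV_READ, EV_CLOSE } event_t;

     typedef struct { int opened; } monitor_t;

     /* Returns 0 if the event violates the requirements model. */
     int monitor_step(monitor_t *m, event_t ev) {
         switch (ev) {
         case EV_OPEN_OK:   m->opened = 1; return 1;
         case EV_OPEN_FAIL: m->opened = 0; return 1;
         case EV_READ:      return m->opened;       /* read requires a prior open */
         case EV_CLOSE:     { int ok = m->opened; m->opened = 0; return ok; }
         }
         return 0;
     }

     int main(void) {
         event_t trace[] = { EV_OPEN_OK, EV_READ, EV_CLOSE, EV_READ };
         monitor_t m = { 0 };
         for (size_t i = 0; i < sizeof trace / sizeof trace[0]; i++)
             if (!monitor_step(&m, trace[i]))
                 printf("violation at event %zu\n", i);  /* the final READ fails */
         return 0;
     }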
  23. Challenge 2 - Complexity
     • Deductive Verification
       • Deductive Verification of Models
       • Software Deductive Verification
       • good confidence, big efforts, big scope (limited by cost)
     • Model Checking
       • Traditional Model Checking
       • Software Model Checking
       • good confidence, moderate efforts, limited complexity of code and requirements
     • Run Time Verification
       • sacrificed confidence, moderate efforts, almost unlimited complexity
  24. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  25. Linux Verification Center, founded in 2005
     • User Space Model Based Testing – Pattern 4
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  26. OLVER: Open Linux VERification
     [Diagram: structure of Linux Standard Base 3.1. LSB Core 3.1 / ISO 23360 consists of the ABI (GLIBC: libc, libcrypt, libdl, libpam, libz, libncurses, libm, libpthread, librt, libutil) and Utilities (ELF, RPM, ...); above it sit LSB C++ and LSB Desktop]
  27. OLVER Results
     • Requirements catalogue built for LSB and POSIX
       • 1532 interfaces
       • 22663 elementary requirements
       • 97 deficiencies in the specifications reported
     • Formal specifications and tests developed for
       • 1270 interfaces (good quality)
       • + 260 interfaces (basic quality)
     • 80+ bugs reported in modern distributions
     • OLVER is a part of the official LSB Certification test suite: http://ispras.linuxfoundation.org
  28. Test Suite Architecture
     [Diagram: from the Specification, a Test oracle, a Test coverage tracker and a Data model are generated automatically; a manually written Test scenario drives the pre-built Test engine through a Scenario driver, and Mediators connect the test system to the System Under Test. Legend: automatic derivation / pre-built / manual / generated]
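     A hedged sketch of what a specification-derived test oracle in this architecture boils down to (hypothetical code, not OLVER's generated sources): the oracle invokes the interface under test and checks the postcondition taken from the formalized requirement.

     #include <stdio.h>
     #include <string.h>

     /* Hypothetical oracle for strlen(): the postcondition encodes the
        elementary requirement "strlen(s) is the index of the first '\0' in s". */
     int oracle_strlen(const char *s) {
         size_t r = strlen(s);            /* call through to the system under test */
         int ok = (s[r] == '\0');         /* the result points at a terminator */
         for (size_t i = 0; i < r; i++)
             if (s[i] == '\0')
                 ok = 0;                  /* no earlier terminator may exist */
         return ok;
     }

     int main(void) {
         printf("verdict: %s\n", oracle_strlen("hello") ? "pass" : "fail");
         return 0;
     }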
  29. OLVER Conclusions
     • model based testing makes it possible to achieve better quality using fewer resources
     • maintenance of MBT is cheaper
  30. OLVER Conclusions
     • model based testing makes it possible to achieve better quality using fewer resources... if you have smart test engineers
     • maintenance of MBT is cheaper... if you have smart test engineers
  31. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program – Pattern 3
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  32. Commit Analysis(*)
     • All patches in stable trees (2.6.35 – 3.0) for 1 year:
       • 26 Oct 2010 – 26 Oct 2011
       • 3101 patches overall
     (*) Khoroshilov A.V., Mutilin V.S., Novikov E.M. Analysis of typical faults in Linux operating system drivers. Proceedings of the Institute for System Programming of RAS, volume 22, 2012, pp. 349-374. (In Russian) http://ispras.ru/ru/proceedings/docs/2012/22/isp_22_2012_349.pdf
     Raw data: http://linuxtesting.org/downloads/ldv-commits-analysis-2012.zip
  33. Taxonomy of Typical Bugs
     Rule class                               Type                                        Bug fixes   Percent   Cumulative
     Correct usage of the Linux kernel API    Alloc/free resources                        32          ~18%      ~18%
     (176 fixes, ~50%)                        Check parameters                            25          ~14%      ~32%
                                              Work in atomic context                      19          ~11%      ~43%
                                              Uninitialized resources                     17          ~10%      ~53%
                                              Synchronization primitives in one thread    12          ~7%       ~60%
                                              Style                                       10          ~6%       ~65%
                                              Network subsystem                           10          ~6%       ~71%
                                              USB subsystem                               9           ~5%       ~76%
                                              Check return values                         7           ~4%       ~80%
                                              DMA subsystem                               4           ~2%       ~82%
                                              Core driver model                           4           ~2%       ~85%
                                              Miscellaneous                               27          ~15%      100%
     Generic (102 fixes, ~30%)                NULL pointer dereferences                   31          ~30%      ~30%
                                              Alloc/free memory                           24          ~24%      ~54%
                                              Syntax                                      14          ~14%      ~68%
                                              Integer overflows                           8           ~8%       ~76%
                                              Buffer overflows                            8           ~8%       ~83%
                                              Uninitialized memory                        6           ~6%       ~89%
                                              Miscellaneous                               11          ~11%      100%
     Synchronization (71 fixes, ~20%)         Races                                       60          ~85%      ~85%
                                              Deadlocks                                   11          ~15%      100%
     (Tool labels on the slide: Reachability, CPALockator, SMG)
  34. CPAchecker (http://cpachecker.sosy-lab.org)
     • Modular framework for software verification
     • Written in Java
     • Open source: Apache 2.0 License
     • Over 40 contributors so far from ~8 universities/institutions
     • ~300 000 lines of code (170 000 without blanks and comments)
     • Started 2007
  35. Verification Tools World
     int main(int argc, char *argv[]) {
         ...
         other_func(var);
         ...
     }
     void other_func(int v) {
         ...
         assert(x != NULL);
     }
  36. Device Driver World
     int usbpn_open(struct net_device *dev) { ... };
     int usbpn_close(struct net_device *dev) { ... };
     struct net_device_ops usbpn_ops = {
         .ndo_open = usbpn_open,
         .ndo_stop = usbpn_close
     };
     int usbpn_probe(struct usb_interface *intf, const struct usb_device_id *id) {
         dev->netdev_ops = &usbpn_ops;
         err = register_netdev(dev);
     }
     void usbpn_disconnect(struct usb_interface *intf) { ... }
     struct usb_driver usbpn_struct = {
         .probe = usbpn_probe,
         .disconnect = usbpn_disconnect,
     };
     int __init usbpn_init(void) { return usb_register(&usbpn_struct); }
     void __exit usbpn_exit(void) { usb_deregister(&usbpn_struct); }
     module_init(usbpn_init);
     module_exit(usbpn_exit);
     Note on the slide: no explicit calls to init/exit procedures
  37. Device Driver World (the same code as on the previous slide)
     Notes on the slide: callback interface procedures registration; no explicit calls to init/exit procedures
  38. Device Driver World (the same code and notes repeated)
  39. Active Driver Environment Model (1)
     int main(int argc, char *argv[]) {
         usbpn_init();
         for (;;) {
             switch (*) {                      /* '*' = nondeterministic choice */
             case 0: usbpn_probe(*, *, *); break;
             case 1: usbpn_open(*, *); break;
             ...
             }
         }
         usbpn_exit();
     }
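     The '*' above stands for a value left open for the tool to choose. A minimal sketch of how such an environment model can be made concrete for a software model checker, assuming the SV-COMP convention __VERIFIER_nondet_int; the usbpn_* prototypes come from the previous slides, everything else is illustrative:

     /* Prototypes of the driver entry points shown on the previous slides. */
     struct usb_interface;
     struct usb_device_id;
     extern int  usbpn_init(void);
     extern void usbpn_exit(void);
     extern int  usbpn_probe(struct usb_interface *intf, const struct usb_device_id *id);
     extern void usbpn_disconnect(struct usb_interface *intf);

     /* Assumed to be provided by the verification tool (SV-COMP convention). */
     extern int __VERIFIER_nondet_int(void);

     int main(void) {
         if (usbpn_init() != 0)
             return 0;                          /* module load may fail */
         int probed = 0;
         while (__VERIFIER_nondet_int()) {      /* arbitrarily long event sequence */
             if (!probed) {
                 if (usbpn_probe(0, 0) == 0)    /* nondet device structures omitted */
                     probed = 1;
             } else {
                 usbpn_disconnect(0);           /* disconnect only after probe */
                 probed = 0;
             }
         }
         if (probed)
             usbpn_disconnect(0);               /* device goes away before unload */
         usbpn_exit();
         return 0;
     }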
  40. Active Driver Environment Model (2)
     • Order limitations
       • open() after probe(), but before remove()
     • Implicit limitations
       • read() only if open() succeeded
       • and it is specific to each class of drivers
  41. Active Driver Environment Model (3)
     • Precise
       • Complete: to avoid missing bugs
       • Correct: to avoid false alarms
  42. Active Driver Environment Model (3)
     • Precise
       • Complete: to avoid missing bugs
       • Correct: to avoid false alarms
     • Simple enough
  43. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems – Patterns 1 & 2
     • Model Based Access Control Testing
  44. Deductive Verification of Models – Pattern 1
     • MLS+MIC+RBAC Access control model (proprietary LSM, AstraLinux)
       • 4500 LoC
       • 60 Vars, 75 Events, 248 Invariants
       • 2962 Proof Obligations
     • MLS+MIC Access control model (SELinux-based LSM, BaseAlt Linux)
       • 1500 LoC
       • 25 Vars, 35 Events, 56 Invariants
       • 791 Proof Obligations
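     The access control models themselves are not shown in the deck. As a purely hypothetical illustration of what one such invariant could look like, a Bell-LaPadula-style MLS "no read up" property in LaTeX notation:

     % Hypothetical invariant: a subject may read an object only if the
     % subject's security level dominates the object's level.
     \forall s \in \mathit{Subjects},\; \forall o \in \mathit{Objects}:\quad
       \mathit{canRead}(s, o) \;\Rightarrow\; \mathit{level}(o) \sqsubseteq \mathit{level}(s)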
  45. Deductive Verification of Code – Pattern 2
     • Proprietary LSM (AstraLinux)
       • 10 KLoC
       • sequential properties
       • assumption of correctness of library functions
     • Collaboration with developers
       • special tools to merge specifications into code
       • specifications are in the source code repository now
       • Continuous Verification
         • GitLab CI tasks to reverify each commit
     • VerKer: unmodified Linux kernel library functions (25)
       • https://forge.ispras.ru/projects/verker - open source
  46. Linux Kernel Specifics
     • Low level memory operations
       • Arithmetic with pointers to fields of structures (container_of; see the sketch below)
       • Prefix structure casts
       • Reinterpret casts
       • Integer overflows
     • Concurrency
     • Other
       • Function pointers
       • String literals
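     The container_of idiom is a representative example of why these low-level operations are hard for verification tools. A minimal, self-contained sketch (the structure and names are hypothetical; the macro is a simplified form of the kernel's): it recovers a pointer to the enclosing structure from a pointer to one of its members, i.e. pointer arithmetic that steps outside the member object.

     #include <stddef.h>
     #include <stdio.h>

     /* Simplified form of the kernel's container_of macro. */
     #define container_of(ptr, type, member) \
         ((type *)((char *)(ptr) - offsetof(type, member)))

     /* Hypothetical driver-private structure embedding a generic member. */
     struct list_head { struct list_head *next, *prev; };
     struct my_device {
         int id;
         struct list_head node;   /* linked into a generic list */
     };

     int main(void) {
         struct my_device dev = { .id = 42 };
         struct list_head *p = &dev.node;        /* generic code sees only this */
         struct my_device *d = container_of(p, struct my_device, node);
         printf("id = %d\n", d->id);             /* recovers the container: 42 */
         return 0;
     }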
  47. Industry Applications – Pattern 2
     • UK Air Traffic Management System
       • 250K logical lines of code (in Ada)
       • proof of type safety; functional correctness for a small part of the code
       • 153K VCs, of which 98.76% are proven automatically
     (*) Angela Wallenburg, "Safe and Secure Programming Using SPARK"
  48. OS Deductive Verification – Pattern 2
     Project               Years      Tools     Target code                                      Scope                                Size
     Verisoft              2004-2008  Isabelle  designed for verification                        hw/kernel/compiler/libraries/apps    10 kLOC (kernel)
     L4.verified (seL4)    2004-2009  Isabelle  designed for verification, performance oriented  microkernel security model (no MMU)  7.5 kLOC (without asm and boot)
     Verisoft-XT small-hv  2007-2013  VCC       designed for verification                        separation property only             2.5 kLOC
     Verisoft-XT Hyper-V   2007-2013  VCC       industrial                                       separation property only             100 kLOC
     Verisoft-XT PikeOS    2007-2013  VCC       industrial, simplicity for performance           some system calls                    10 kLOC
  49. OS Deductive Verification – Pattern 2
     (same comparison table as on the previous slide)
     2:1 overhead for specifications and 10:1 overhead for the proofs
  50. Formal Methods: Conclusions
     • A mathematical magic tool proving that, in all possible configurations, on all possible input data, with all possible interactions with environments, with all possible timings/preemptions/..., software behaves correctly
     • Trade-off between
       • confidence
       • feasibility
       • cost
  51. Conclusions (1)
     Formal methods
     • can be applied with various levels of the resource investment/confidence ratio
     • can guarantee the absence of certain defects under certain assumptions
     • provide a valuable way to increase confidence in system reliability
  52. Conclusions (2)
     • Proving absence of typical bugs/safety properties
       • limited efforts, variable confidence, limited size of code
     • Full-fledged formal specifications of requirements
       • encourage an abstract view of the system
       • often more valuable than the verification itself
       • deductive verification
         • expensive, good confidence, limited scope
       • run time verification
         • moderate cost, limited confidence, unlimited size of code
  53. Future Directions
     • Combination of techniques, each advancing the others
       • Deductive verification
       • Software model checking
       • Run time verification
     • Aspect-based requirements models
     • Abstraction hints for tools
     • Better automation and human-machine interfaces
  54. Ivannikov Institute for System Programming of the Russian Academy of Sciences
     Thank you!
     http://linuxtesting.org
     http://ispras.ru
  55. Ivannikov Institute for System Programming of the Russian Academy of Sciences
     Thank you!
     Alexey Khoroshilov [email protected]
     http://linuxtesting.org/
     Morris Kline. "Mathematics: The Loss of Certainty". Oxford University Press, 1980.