
TMPA-2021: Formal Methods: Theory and Practice of Linux Verification Center

Alexey Khoroshilov, ISP RAS
Exactpro
November 25, 2021

TMPA is an annual International Conference on Software Testing, Machine Learning and Complex Process Analysis. The conference will focus on the application of modern methods of data science to the analysis of software quality.

To learn more about Exactpro, visit our website https://exactpro.com/

Follow us on
LinkedIn https://www.linkedin.com/company/exactpro-systems-llc
Twitter https://twitter.com/exactpro


Transcript

  1. Formal Methods: Theory and Practice of Linux Verification Center
     Alexey Khoroshilov [email protected]
     Ivannikov Institute for System Programming of the Russian Academy of Sciences
     SOFTWARE TESTING, MACHINE LEARNING AND COMPLEX PROCESS ANALYSIS, 25-27 NOVEMBER 2021
  2. Formal Methods: A mathematical magic tool
     • in all possible configurations, on all possible input data, with all possible interactions with environments, with all possible timings/preemptions/...
     • software behaves correctly
  3. Formal Methods: How it works
     [Diagram: in the real world, Software satisfies Requirements; in the mathematical world, a Proof shows that the Software Model satisfies the Requirements Model]
  4. Formal Methods: Challenges
     [Same diagram, with two challenges marked: 1. Transition (from real-world Software and Requirements to their models), 2. Complexity (of the proof)]
  5. Challenge 1 - Transition
     • "Transition from the informal to the formal is essentially informal" (M.R. Shura-Bura)
  6. Challenge 1 - Transition
     • "Transition from the informal to the formal is essentially informal" (M.R. Shura-Bura)
     [Diagram fragment: Requirements → Requirements Model]
     • Software behaves correctly
     • Requirements for real software are complex

     JetOS (DO-178C Avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
     Pages (A4)     1048         2620
     Elements       3041         7753
     Requirements   1359         2408
     Definitions    894          1960
     Notes          785          2383
     Sections       471          1620
  7. Challenge 1 - Transition
     JetOS (DO-178C Avionics RTOS) developed by ISPRAS:
                    HLR (~80%)   LLR (~50%)
     Pages (A4)     1048         2620
     Elements       3041         7753
     Requirements   1359         2408
     Definitions    894          1960
     Notes          785          2383
     Sections       471          1620

     Source Code Statistics:
     Components     57
     Functions      892
     Lines          23 KLoC     (1 HLR per 17 LoC)

     For comparison (statistics by https://www.openhub.net, accessed 24 Nov 2021):
     Linux Kernel   21,640 KLoC
     OpenJDK        11,641 KLoC
     PostgreSQL     1,123 KLoC
  8. Challenge 1 - Transition
     • Any precise enough (formal) requirements are complex and definitely contain bugs
     • Mitigations:
       • Formalize only simple properties
         • easily audited by experts (or even stakeholders)
       • Avoid requirements at all; check for the absence of typical bugs or asserts in the code
         • e.g. safety properties: buffer overrun, NULL pointer dereference, double free, etc.
       • Complete specifications
         • checked against implementations
         • checked for self-consistency by tools
         • reviewed by experts
  9. Challenge 1 - Transition
     • Any precise enough (formal) requirements are complex and definitely contain bugs
     • Mitigations:
       • Formalize only simple properties
         • easily audited by experts (or even stakeholders)
         • good confidence, limited scope
       • Avoid requirements at all; check for the absence of typical bugs or asserts in the code
         • e.g. safety properties: buffer overrun, NULL pointer dereference, double free, etc. (illustrated below)
         • good confidence, very limited scope
       • Complete specifications
         • checked against implementations
         • checked for self-consistency by tools
         • reviewed by experts
         • questionable confidence, large scope
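     The safety properties named above correspond to concrete code patterns. A minimal, hypothetical C sketch (not from the talk) containing all three defects such checkers target:

     #include <stdlib.h>
     #include <string.h>

     /* Hypothetical examples of the safety properties named above. */
     void typical_bugs(const char *input) {
         char buf[8];
         strcpy(buf, input);          /* buffer overrun if input is longer than 7 chars */

         char *p = malloc(16);
         p[0] = 'x';                  /* NULL pointer dereference if malloc failed */

         free(p);
         free(p);                     /* double free */
     }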
  10. Challenge 2 - Complexity
     • The main mitigation is abstraction
       • Simplify and ignore irrelevant details
       • Focus on and generalize important central properties and characteristics
       • Avoid premature design and implementation choices
  11. Complexity Challenge – Pattern 1
     [Diagram: Software and Requirements are translated and abstracted by hand into a Software Model and a Requirements Model; the proof is done by Deductive Verification or Model Checking]
  12. Complexity Challenge – Pattern 1
     Software model built by an expert
     • Model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
     • Deductive verification
       • manual decomposition
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
  13. Complexity Challenge – Pattern 1
     Software model built by an expert
     • Model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
       • good confidence, moderate efforts, limited size
     • Deductive verification
       • manual decomposition
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
       • good confidence, big efforts, big scope (limited by cost)
  14. Complexity Challenge – Pattern 2
     [Diagram: Requirements are translated and abstracted into a Requirements Model; the Software itself is handled by Software Deductive Verification, with abstraction done at the decomposition level]
  15. Complexity Challenge – Pattern 2
     Software model built by tools (white box), abstraction done by an expert at the decomposition level
     • Software deductive verification
       • manual abstraction at the decomposition level
         • e.g. pre/post conditions, loop invariants, ...
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
  16. Complexity Challenge – Pattern 2
     Software model built by tools (white box), abstraction done by an expert at the decomposition level
     • Software deductive verification
       • manual abstraction at the decomposition level
         • e.g. pre/post conditions, loop invariants, ... (see the sketch below)
       • automated discharge of generated verification conditions
       • fallback to interactive theorem proving if a VC is too complex
       • good confidence, big efforts, big scope (limited by cost)
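     As a concrete illustration of abstraction at the decomposition level, a minimal sketch in the ACSL annotation language used by Frama-C-style tools (the function is hypothetical, not from the talk): the contract and the loop invariant are the manual abstraction; the verification conditions they generate are discharged automatically.

     #include <stddef.h>

     /*@ requires n > 0 && \valid_read(a + (0 .. n-1));
       @ assigns \nothing;
       @ ensures \forall integer k; 0 <= k < n ==> \result >= a[k];
       @*/
     int max_elem(const int *a, size_t n) {
         int m = a[0];
         size_t i = 1;
         /*@ loop invariant 1 <= i <= n;
           @ loop invariant \forall integer j; 0 <= j < i ==> m >= a[j];
           @ loop assigns i, m;
           @ loop variant n - i;
           @*/
         while (i < n) {
             if (a[i] > m)
                 m = a[i];        /* keep the running maximum */
             i++;
         }
         return m;
     }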
  17. Complexity Challenge – Pattern 3
     [Diagram: Requirements are translated and abstracted into a Requirements Model; the Software is verified directly by Software Model Checking, with abstraction done by the tool while building the proof]
  18. Complexity Challenge – Pattern 3
     Software model built by tools (black box), abstraction done by the tools while building the proof
     • Software model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
  19. Complexity Challenge – Pattern 3
     Software model built by tools (black box), abstraction done by the tools while building the proof
     • Software model checking
       • fully automated proof
       • any incremental change in SW or Reqs can step over the limits of the tool's capacity
         • with no good fallback
       • good confidence, moderate efforts, limited complexity of code and requirements
       • less confidence, better scope
  20. Complexity Challenge – Pattern 4
     [Diagram: Requirements are translated and abstracted into a Requirements Model; instead of a software model, an Execution Trace is collected from the running Software and checked against the Requirements Model by Trace Checkers]
  21. Complexity Challenge – Pattern 4
     • A software model is not built at all: tests are executed, events are collected, and the resulting traces are checked against the requirements model
     • Run Time Verification
       • traces are checked against the model automatically
         • much easier
       • tests prepared manually or generated from the requirements model
  22. Complexity Challenge – Pattern 4
     • A software model is not built at all: tests are executed, events are collected, and the resulting traces are checked against the requirements model
     • Run Time Verification
       • traces are checked against the model automatically
         • much easier
       • tests prepared manually or generated from the requirements model
       • sacrificed confidence, moderate efforts, almost unlimited complexity
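     A minimal sketch of the trace-checking idea (hypothetical event alphabet and property, not the actual ISPRAS tooling): the requirements model is reduced to a small state machine over events, and every recorded event is checked against it.

     #include <stdio.h>

     /* Hypothetical event alphabet and property: "read may occur only
        between a successful open and the matching close". */
     typedef enum { EV_OPEN_OK, EV_OPEN_FAIL, EV_READ, EV_CLOSE } event_t;

     typedef struct { int opened; } monitor_t;

     /* Returns 0 if the event violates the requirements model. */
     int monitor_step(monitor_t *m, event_t ev) {
         switch (ev) {
         case EV_OPEN_OK:   m->opened = 1; return 1;
         case EV_OPEN_FAIL: m->opened = 0; return 1;
         case EV_READ:      return m->opened;       /* read requires a prior open */
         case EV_CLOSE:     { int ok = m->opened; m->opened = 0; return ok; }
         }
         return 0;
     }

     int main(void) {
         event_t trace[] = { EV_OPEN_OK, EV_READ, EV_CLOSE, EV_READ };
         monitor_t m = { 0 };
         for (size_t i = 0; i < sizeof trace / sizeof trace[0]; i++)
             if (!monitor_step(&m, trace[i]))
                 printf("violation at event %zu\n", i);  /* the final READ fails */
         return 0;
     }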
  23. Challenge 2 - Complexity
     • Deductive Verification
       • Deductive Verification of Models
       • Software Deductive Verification
       • good confidence, big efforts, big scope (limited by cost)
     • Model Checking
       • Traditional Model Checking
       • Software Model Checking
       • good confidence, moderate efforts, limited complexity of code and requirements
     • Run Time Verification
       • sacrificed confidence, moderate efforts, almost unlimited complexity
  24. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  25. Linux Verification Center, founded in 2005
     • User Space Model Based Testing – Pattern 4
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  26. OLVER: Open Linux VERification
     [Diagram: structure of Linux Standard Base 3.1. LSB Core 3.1 / ISO 23360 consists of the ABI (GLIBC: libc, libcrypt, libdl, libpam, libz, libncurses, libm, libpthread, librt, libutil) and Utilities (ELF, RPM, ...); above it sit LSB C++ and LSB Desktop]
  27. OLVER Results
     • Requirements catalogue built for LSB and POSIX
       • 1532 interfaces
       • 22663 elementary requirements
       • 97 deficiencies in the specifications reported
     • Formal specifications and tests developed for
       • 1270 interfaces (good quality)
       • + 260 interfaces (basic quality)
     • 80+ bugs reported in modern distributions
     • OLVER is a part of the official LSB Certification test suite: http://ispras.linuxfoundation.org
  28. Test Suite Architecture
     [Diagram: from the Specification, a Test oracle, a Test coverage tracker and a Data model are generated automatically; a manually written Test scenario drives the pre-built Test engine through a Scenario driver, and Mediators connect the test system to the System Under Test. Legend: automatic derivation / pre-built / manual / generated]
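     A hedged sketch of what a specification-derived test oracle in this architecture boils down to (hypothetical code, not OLVER's generated sources): the oracle invokes the interface under test and checks the postcondition taken from the formalized requirement.

     #include <stdio.h>
     #include <string.h>

     /* Hypothetical oracle for strlen(): the postcondition encodes the
        elementary requirement "strlen(s) is the index of the first '\0' in s". */
     int oracle_strlen(const char *s) {
         size_t r = strlen(s);            /* call through to the system under test */
         int ok = (s[r] == '\0');         /* the result points at a terminator */
         for (size_t i = 0; i < r; i++)
             if (s[i] == '\0')
                 ok = 0;                  /* no earlier terminator may exist */
         return ok;
     }

     int main(void) {
         printf("verdict: %s\n", oracle_strlen("hello") ? "pass" : "fail");
         return 0;
     }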
  29. OLVER Conclusions
     • model based testing makes it possible to achieve better quality using fewer resources
     • maintenance of MBT is cheaper
  30. OLVER Conclusions
     • model based testing makes it possible to achieve better quality using fewer resources... if you have smart test engineers
     • maintenance of MBT is cheaper... if you have smart test engineers
  31. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program – Pattern 3
     • Linux File System Verification
     • Deductive Verification of Operating Systems
     • Model Based Access Control Testing
  32. Commit Analysis(*)
     • All patches in stable trees (2.6.35 – 3.0) for 1 year:
       • 26 Oct 2010 – 26 Oct 2011
       • 3101 patches overall
     (*) Khoroshilov A.V., Mutilin V.S., Novikov E.M. Analysis of typical faults in Linux operating system drivers. Proceedings of the Institute for System Programming of RAS, volume 22, 2012, pp. 349-374. (In Russian) http://ispras.ru/ru/proceedings/docs/2012/22/isp_22_2012_349.pdf
     Raw data: http://linuxtesting.org/downloads/ldv-commits-analysis-2012.zip
  33. Taxonomy of Typical Bugs
     Rule class                               Type                                        Bug fixes   Percent   Cumulative
     Correct usage of the Linux kernel API    Alloc/free resources                        32          ~18%      ~18%
     (176 fixes, ~50%)                        Check parameters                            25          ~14%      ~32%
                                              Work in atomic context                      19          ~11%      ~43%
                                              Uninitialized resources                     17          ~10%      ~53%
                                              Synchronization primitives in one thread    12          ~7%       ~60%
                                              Style                                       10          ~6%       ~65%
                                              Network subsystem                           10          ~6%       ~71%
                                              USB subsystem                               9           ~5%       ~76%
                                              Check return values                         7           ~4%       ~80%
                                              DMA subsystem                               4           ~2%       ~82%
                                              Core driver model                           4           ~2%       ~85%
                                              Miscellaneous                               27          ~15%      100%
     Generic (102 fixes, ~30%)                NULL pointer dereferences                   31          ~30%      ~30%
                                              Alloc/free memory                           24          ~24%      ~54%
                                              Syntax                                      14          ~14%      ~68%
                                              Integer overflows                           8           ~8%       ~76%
                                              Buffer overflows                            8           ~8%       ~83%
                                              Uninitialized memory                        6           ~6%       ~89%
                                              Miscellaneous                               11          ~11%      100%
     Synchronization (71 fixes, ~20%)         Races                                       60          ~85%      ~85%
                                              Deadlocks                                   11          ~15%      100%
     (Tool labels on the slide: Reachability, CPALockator, SMG)
  34. CPAchecker (http://cpachecker.sosy-lab.org)
     • Modular framework for software verification
     • Written in Java
     • Open source: Apache 2.0 License
     • Over 40 contributors so far from ~8 universities/institutions
     • ~300 000 lines of code (170 000 without blanks and comments)
     • Started 2007
  35. Verification Tools World
     int main(int argc, char *argv[]) {
         ...
         other_func(var);
         ...
     }
     void other_func(int v) {
         ...
         assert(x != NULL);
     }
  36. Device Driver World
     int usbpn_open(struct net_device *dev) { ... };
     int usbpn_close(struct net_device *dev) { ... };
     struct net_device_ops usbpn_ops = {
         .ndo_open = usbpn_open,
         .ndo_stop = usbpn_close
     };
     int usbpn_probe(struct usb_interface *intf, const struct usb_device_id *id) {
         dev->netdev_ops = &usbpn_ops;
         err = register_netdev(dev);
     }
     void usbpn_disconnect(struct usb_interface *intf) { ... }
     struct usb_driver usbpn_struct = {
         .probe = usbpn_probe,
         .disconnect = usbpn_disconnect,
     };
     int __init usbpn_init(void) { return usb_register(&usbpn_struct); }
     void __exit usbpn_exit(void) { usb_deregister(&usbpn_struct); }
     module_init(usbpn_init);
     module_exit(usbpn_exit);
     Note on the slide: no explicit calls to init/exit procedures
  37. Device Driver World (the same code as on the previous slide)
     Notes on the slide: callback interface procedures registration; no explicit calls to init/exit procedures
  38. Device Driver World (the same code and notes repeated)
  39. Active Driver Environment Model (1)
     int main(int argc, char *argv[]) {
         usbpn_init();
         for (;;) {
             switch (*) {                      /* '*' = nondeterministic choice */
             case 0: usbpn_probe(*, *, *); break;
             case 1: usbpn_open(*, *); break;
             ...
             }
         }
         usbpn_exit();
     }
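     The '*' above stands for a value left open for the tool to choose. A minimal sketch of how such an environment model can be made concrete for a software model checker, assuming the SV-COMP convention __VERIFIER_nondet_int; the usbpn_* prototypes come from the previous slides, everything else is illustrative:

     /* Prototypes of the driver entry points shown on the previous slides. */
     struct usb_interface;
     struct usb_device_id;
     extern int  usbpn_init(void);
     extern void usbpn_exit(void);
     extern int  usbpn_probe(struct usb_interface *intf, const struct usb_device_id *id);
     extern void usbpn_disconnect(struct usb_interface *intf);

     /* Assumed to be provided by the verification tool (SV-COMP convention). */
     extern int __VERIFIER_nondet_int(void);

     int main(void) {
         if (usbpn_init() != 0)
             return 0;                          /* module load may fail */
         int probed = 0;
         while (__VERIFIER_nondet_int()) {      /* arbitrarily long event sequence */
             if (!probed) {
                 if (usbpn_probe(0, 0) == 0)    /* nondet device structures omitted */
                     probed = 1;
             } else {
                 usbpn_disconnect(0);           /* disconnect only after probe */
                 probed = 0;
             }
         }
         if (probed)
             usbpn_disconnect(0);               /* device goes away before unload */
         usbpn_exit();
         return 0;
     }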
  40. Active Driver Environment Model (2)
     • Order limitations
       • open() after probe(), but before remove()
     • Implicit limitations
       • read() only if open() succeeded
       • and it is specific to each class of drivers
  41. Active Driver Environment Model (3)
     • Precise
       • Complete: to avoid missing bugs
       • Correct: to avoid false alarms
  42. Active Driver Environment Model (3)
     • Precise
       • Complete: to avoid missing bugs
       • Correct: to avoid false alarms
     • Simple enough
  43. Linux Verification Center, founded in 2005
     • User Space Model Based Testing
     • Application Binary/Program Interface Stability
     • Linux Driver Verification Program
     • Linux File System Verification
     • Deductive Verification of Operating Systems – Patterns 1 & 2
     • Model Based Access Control Testing
  44. Deductive Verification of Models – Pattern 1
     • MLS+MIC+RBAC Access control model (proprietary LSM, AstraLinux)
       • 4500 LoC
       • 60 Vars, 75 Events, 248 Invariants
       • 2962 Proof Obligations
     • MLS+MIC Access control model (SELinux-based LSM, BaseAlt Linux)
       • 1500 LoC
       • 25 Vars, 35 Events, 56 Invariants
       • 791 Proof Obligations
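     The access control models themselves are not shown in the deck. As a purely hypothetical illustration of what one such invariant could look like, a Bell-LaPadula-style MLS "no read up" property in LaTeX notation:

     % Hypothetical invariant: a subject may read an object only if the
     % subject's security level dominates the object's level.
     \forall s \in \mathit{Subjects},\; \forall o \in \mathit{Objects}:\quad
       \mathit{canRead}(s, o) \;\Rightarrow\; \mathit{level}(o) \sqsubseteq \mathit{level}(s)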
  45. Deductive Verification of Code – Pattern 2
     • Proprietary LSM (AstraLinux)
       • 10 KLoC
       • sequential properties
       • assumption of correctness of library functions
     • Collaboration with developers
       • special tools to merge specifications into code
       • specifications are in the source code repository now
       • Continuous Verification
         • GitLab CI tasks to reverify each commit
     • VerKer: unmodified Linux kernel library functions (25)
       • https://forge.ispras.ru/projects/verker - open source
  46. Linux Kernel Specifics
     • Low level memory operations
       • Arithmetic with pointers to fields of structures (container_of; see the sketch below)
       • Prefix structure casts
       • Reinterpret casts
       • Integer overflows
     • Concurrency
     • Other
       • Function pointers
       • String literals
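     The container_of idiom is a representative example of why these low-level operations are hard for verification tools. A minimal, self-contained sketch (the structure and names are hypothetical; the macro is a simplified form of the kernel's): it recovers a pointer to the enclosing structure from a pointer to one of its members, i.e. pointer arithmetic that steps outside the member object.

     #include <stddef.h>
     #include <stdio.h>

     /* Simplified form of the kernel's container_of macro. */
     #define container_of(ptr, type, member) \
         ((type *)((char *)(ptr) - offsetof(type, member)))

     /* Hypothetical driver-private structure embedding a generic member. */
     struct list_head { struct list_head *next, *prev; };
     struct my_device {
         int id;
         struct list_head node;   /* linked into a generic list */
     };

     int main(void) {
         struct my_device dev = { .id = 42 };
         struct list_head *p = &dev.node;        /* generic code sees only this */
         struct my_device *d = container_of(p, struct my_device, node);
         printf("id = %d\n", d->id);             /* recovers the container: 42 */
         return 0;
     }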
  47. Industry Applications – Pattern 2
     • UK Air Traffic Management System
       • 250K logical lines of code (in Ada)
       • proof of type safety; functional correctness for a small part of the code
       • 153K VCs, of which 98.76% are proven automatically
     (*) Angela Wallenburg, "Safe and Secure Programming Using SPARK"
  48. OS Deductive Verification – Pattern 2
     Project               Years      Tools     Target code                                      Scope                                Size
     Verisoft              2004-2008  Isabelle  designed for verification                        hw/kernel/compiler/libraries/apps    10 kLOC (kernel)
     L4.verified (seL4)    2004-2009  Isabelle  designed for verification, performance oriented  microkernel security model (no MMU)  7.5 kLOC (without asm and boot)
     Verisoft-XT small-hv  2007-2013  VCC       designed for verification                        separation property only             2.5 kLOC
     Verisoft-XT Hyper-V   2007-2013  VCC       industrial                                       separation property only             100 kLOC
     Verisoft-XT PikeOS    2007-2013  VCC       industrial, simplicity for performance           some system calls                    10 kLOC
  49. OS Deductive Verification – Pattern 2
     (same comparison table as on the previous slide)
     2:1 overhead for specifications and 10:1 overhead for the proofs
  50. Formal Methods: Conclusions
     • A mathematical magic tool proving that, in all possible configurations, on all possible input data, with all possible interactions with environments, with all possible timings/preemptions/..., software behaves correctly
     • Trade-off between
       • confidence
       • feasibility
       • cost
  51. Conclusions (1)
     Formal methods
     • can be applied with various levels of the resource investment/confidence ratio
     • can guarantee the absence of certain defects under certain assumptions
     • provide a valuable way to increase confidence in system reliability
  52. Conclusions (2)
     • Proving absence of typical bugs/safety properties
       • limited efforts, variable confidence, limited size of code
     • Full-fledged formal specifications of requirements
       • encourage an abstract view of the system
       • often more valuable than the verification itself
       • deductive verification
         • expensive, good confidence, limited scope
       • run time verification
         • moderate cost, limited confidence, unlimited size of code
  53. Future Directions
     • Combination of techniques, each advancing the others
       • Deductive verification
       • Software model checking
       • Run time verification
     • Aspect-based requirements models
     • Abstraction hints for tools
     • Better automation and human-machine interfaces
  54. Ivannikov Institute for System Programming of the Russian Academy of Sciences
     Thank you!
     http://linuxtesting.org
     http://ispras.ru
  55. Ivannikov Institute for System Programming of the Russian Academy of Sciences
     Thank you!
     Alexey Khoroshilov [email protected]
     http://linuxtesting.org/
     Morris Kline. "Mathematics: The Loss of Certainty". Oxford University Press, 1980.