
System-theoretic process analysis applied to software-intensive systems

I explain STAMP/STPA, a safety-analysis approach based on system theory, and our work applying it to software-intensive systems by adding a direct connection to verification (model checking and test-case generation).
This was presented at fortiss GmbH in Munich, Germany.

Stefan Wagner

April 08, 2016

Transcript

  1. You can copy, share and change, film and photograph, blog, live-blog and tweet this presentation, provided that you attribute it to its author and respect the rights and licences of its parts. Based on slides by @SMEasterbrook and @ethanwhite.
  2. Assumption 1: Safety is increased by increasing system or component reliability. If components or systems do not fail, then accidents will not occur. From: Leveson. Engineering a Safer World. MIT Press, 2011
  3. A plane that does not take off because of a software check is totally safe but not reliable.
  4. New Assumption 1: High reliability is neither necessary nor sufficient for safety. (Replaces Assumption 1: Safety is increased by increasing system or component reliability. If components or systems do not fail, then accidents will not occur.) From: Leveson. Engineering a Safer World. MIT Press, 2011
  5. Assumption 2: Accidents are caused by chains of directly related events. We can understand accidents and assess risk by looking at the chain of events leading to the loss. From: Leveson. Engineering a Safer World. MIT Press, 2011
  6. Subjective Selection: Why do we always hear that "human error" of the operators, drivers or pilots caused an accident?
  7. New Assumption 2: Accidents are complex processes involving the entire socio-technical system. Traditional event-chain models cannot describe this process adequately. (Replaces Assumption 2: Accidents are caused by chains of directly related events. We can understand accidents and assess risk by looking at the chain of events leading to the loss.) From: Leveson. Engineering a Safer World. MIT Press, 2011
  8. Assumption 3: Most accidents are caused by operator error. Rewarding safe behavior and punishing unsafe behavior will eliminate or reduce accidents significantly. From: Leveson. Engineering a Safer World. MIT Press, 2011
  9. New Assumption 3: Operator behavior is a product of the environment in which it occurs. To reduce operator "error" we must change the environment in which the operator works. (Replaces Assumption 3: Most accidents are caused by operator error. Rewarding safe behavior and punishing unsafe behavior will eliminate or reduce accidents significantly.) From: Leveson. Engineering a Safer World. MIT Press, 2011
  10. Assumption 4: Probabilistic risk analysis based on event chains is the best way to assess and communicate safety and risk information. From: Leveson. Engineering a Safer World. MIT Press, 2011
  11. New Assumption 4: Risk and safety may be best understood and communicated in ways other than probabilistic risk analysis. (Replaces Assumption 4: Probabilistic risk analysis based on event chains is the best way to assess and communicate safety and risk information.) From: Leveson. Engineering a Safer World. MIT Press, 2011
  12. Software is reliable but unsafe when:
    • The software correctly implements the requirements, but the specified behavior is unsafe from a system perspective.
    • The software requirements do not specify some particular behavior required for system safety (that is, they are incomplete).
    • The software has unintended (and unsafe) behavior beyond what is specified in the requirements.
    From: Leveson. Engineering a Safer World. MIT Press, 2011
  13. New Assumption 5: Highly reliable software is not necessarily safe. Increasing software reliability or reducing implementation errors will have little impact on safety. (Replaces Assumption 5: Highly reliable software is safe.) From: Leveson. Engineering a Safer World. MIT Press, 2011
  14. STAMP. [Figure: Inadequate enforcement of safety constraints on process behavior by the hierarchical safety control structure leads to a hazardous process/system state. From: Leveson. Engineering a Safer World. MIT Press, 2011]
  15. [Figure: Leveson's general hierarchical safety control structure, covering both system development and system operations. It reaches from Congress and legislatures, government regulatory agencies, industry and user associations, unions, insurance companies and courts, through company and project management, down to the operating process with human and automated controllers, actuators, sensors and the physical process. Downward channels carry legislation, regulations, certification, standards, safety policy, resources, work procedures and design or work instructions; upward channels carry accident and incident reports, operations and maintenance reports, audits, hazard analyses, risk assessments, change requests, problem reports and whistleblower information. From: Leveson. Engineering a Safer World. MIT Press, 2011]
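As a minimal sketch (not part of the deck), such a hierarchical safety control structure could be captured as data for later analysis; the level names and channel labels below are taken loosely from the figure, everything else is an illustrative assumption:

```python
from dataclasses import dataclass, field

@dataclass
class Controller:
    """One level in a hierarchical safety control structure."""
    name: str
    control_actions: list[str] = field(default_factory=list)  # downward channel
    feedback: list[str] = field(default_factory=list)         # upward channel
    subordinate: "Controller | None" = None

# Hypothetical excerpt of the operations side of the structure.
operating_process = Controller("Operating process (human/automated controllers)")
operations_mgmt = Controller(
    "Operations management",
    control_actions=["work procedures", "safety policy", "resources"],
    feedback=["operations reports", "incident reports", "audits"],
    subordinate=operating_process,
)
company_mgmt = Controller(
    "Company management",
    control_actions=["safety policy", "standards", "resources"],
    feedback=["risk assessments", "status reports"],
    subordinate=operations_mgmt,
)

def print_structure(c: Controller, indent: int = 0) -> None:
    """Walk the hierarchy from the top controller downwards."""
    print(" " * indent + c.name)
    if c.subordinate:
        print(" " * indent + "  v " + ", ".join(c.control_actions))
        print(" " * indent + "  ^ " + ", ".join(c.feedback))
        print_structure(c.subordinate, indent + 4)

print_structure(company_mgmt)
```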
  16. [Figure: Detail of the operating process from the previous slide: human controller(s) and an automated controller act on the physical process via actuators and sensors, guided by operating procedures and operating assumptions, and surrounded by problem reports, audit reports, change requests, revised operating procedures, hardware replacements, software revisions and incident information.]
  17. Safety is a control problem. [Figure: A generic control loop: the controller, with its control algorithms and set points, sends controlled variables via actuators to the controlled process (which has process inputs, process outputs and disturbances); sensors return measured variables to the controller. Example elements: Hall effect sensor, volume of noise, off switch, thrust.]
  18. Safety is a control problem. [Figure: The same control loop, annotated with who or what can act as controller: an automatic controller, a human, the physical design, or social control of the process. Example elements: off switch, Hall effect sensor, volume of noise, thrust.]
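A minimal sketch of "safety as a control problem", assuming the Hall effect sensor and the volume of noise are the measured variables, the off switch acting on thrust is the actuator, and the noise set point is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ProcessModel:
    """The controller's belief about the controlled process (assumed mapping)."""
    rotor_running: bool = False   # inferred from the Hall effect sensor
    noise_db: float = 0.0         # measured volume of noise

class NoiseController:
    """Control algorithm: keep the process below a noise set point."""
    NOISE_SET_POINT = 85.0  # illustrative value, not from the deck

    def __init__(self) -> None:
        self.model = ProcessModel()

    def observe(self, hall_sensor_pulses: bool, noise_db: float) -> None:
        """Sensors -> measured variables -> process model."""
        self.model.rotor_running = hall_sensor_pulses
        self.model.noise_db = noise_db

    def act(self) -> str:
        """Controller -> actuator: the off switch cuts thrust."""
        if self.model.rotor_running and self.model.noise_db > self.NOISE_SET_POINT:
            return "off switch: cut thrust"
        return "no action"

ctrl = NoiseController()
ctrl.observe(hall_sensor_pulses=True, noise_db=95.0)
print(ctrl.act())  # off switch: cut thrust
```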
  19. [Figure: The control loop annotated with causal factors for inadequate control: (1) unsafe inputs from higher levels (control input or external information wrong or missing); (2) unsafe control algorithms (flaws in creation, process changes, incorrect modification or adaptation); (3) an incorrect model of the process (inconsistent, incomplete, or incorrect), caused for example by inadequate or missing feedback, measurement inaccuracies and feedback delays from the sensors; (4) incorrect process execution (component failures, delayed operation, inappropriate, ineffective or missing control actions, conflicting control actions, unidentified or out-of-range disturbances, process input missing or wrong, process output contributing to a system hazard, changes over time).]
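The four groups of causal factors could, for example, be kept as a simple review checklist; the factor strings below come from the figure, the data structure itself is only a sketch:

```python
from enum import Enum

class CausalFactorGroup(Enum):
    """The four groups of causal factors for inadequate control (slide 19)."""
    UNSAFE_INPUTS_FROM_HIGHER_LEVELS = 1
    UNSAFE_CONTROL_ALGORITHM = 2
    INCORRECT_PROCESS_MODEL = 3
    INCORRECT_PROCESS_EXECUTION = 4

# Example causal factors from the figure, grouped for a review checklist.
CHECKLIST = {
    CausalFactorGroup.UNSAFE_INPUTS_FROM_HIGHER_LEVELS: [
        "control input or external information wrong or missing",
    ],
    CausalFactorGroup.UNSAFE_CONTROL_ALGORITHM: [
        "flaws in creation", "process changes", "incorrect modification or adaptation",
    ],
    CausalFactorGroup.INCORRECT_PROCESS_MODEL: [
        "inadequate or missing feedback", "measurement inaccuracies", "feedback delays",
    ],
    CausalFactorGroup.INCORRECT_PROCESS_EXECUTION: [
        "component failures", "delayed operation", "conflicting control actions",
        "unidentified or out-of-range disturbance",
    ],
}

for group, factors in CHECKLIST.items():
    print(f"{group.value}. {group.name}: {', '.join(factors)}")
```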
  20. Example: potentially unsafe control actions for the control actions "power on" and "power off" in a system with a door (columns: not given, given incorrectly, wrong timing or order, stopped too soon):
    Power on: (not given) power not turned on; (given incorrectly) power turned on while door opened; (wrong timing or order) power turned on too early, door not fully closed; (stopped too soon) not applicable.
    Power off: (not given) power not turned off when door opened; (given incorrectly) power turned off when door closed; (wrong timing or order) door opened, controller waits too long to turn off power; (stopped too soon) not applicable.
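A minimal sketch of how such a table of potentially unsafe control actions could be represented and turned into safety constraints; the entries are taken from the example above, while the negation heuristic is an illustrative assumption, not the method on the slide:

```python
from dataclasses import dataclass

GUIDE_WORDS = ("not given", "given incorrectly", "wrong timing or order", "stopped too soon")

@dataclass
class UnsafeControlAction:
    control_action: str
    guide_word: str
    context: str  # the hazardous entry from the table

# Entries reconstructed from the door/power example table.
UCAS = [
    UnsafeControlAction("power off", "not given",
                        "power not turned off when door opened"),
    UnsafeControlAction("power off", "wrong timing or order",
                        "door opened, controller waits too long to turn off power"),
    UnsafeControlAction("power on", "given incorrectly",
                        "power turned on while door opened"),
    UnsafeControlAction("power on", "wrong timing or order",
                        "power turned on too early; door not fully closed"),
]

def safety_constraint(uca: UnsafeControlAction) -> str:
    """Turn an unsafe control action into a constraint by negating it (simple heuristic)."""
    return f"The controller must not allow: {uca.context}."

for uca in UCAS:
    print(f"[{uca.control_action} / {uca.guide_word}] {safety_constraint(uca)}")
```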
  21. Connection to Verification. Abdulkhaleq, Wagner, Leveson. A Comprehensive Safety Engineering Approach for Software-Intensive Systems Based on STPA. Procedia Engineering 128:2–11, 2015.
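The cited approach connects STPA results to model checking and test-case generation. As an illustrative sketch only (not the tooling described in the paper), a safety constraint from the door/power example could be expressed as an LTL-style property and checked by a simple test oracle:

```python
def to_ltl(unsafe_condition: str) -> str:
    """Map a constraint 'never <unsafe condition>' to an LTL-style formula.
    G means 'globally'; the proposition names are placeholders chosen here."""
    return f"G !({unsafe_condition})"

# Illustrative constraint from the door/power example.
UNSAFE = "power_on && door_open"
print(to_ltl(UNSAFE))  # G !(power_on && door_open)

def test_oracle(trace: list[dict]) -> bool:
    """Oracle for generated test cases: the unsafe state must never occur in a trace."""
    return all(not (step["power_on"] and step["door_open"]) for step in trace)

# A hypothetical generated test trace is checked against the oracle.
trace = [
    {"power_on": False, "door_open": True},
    {"power_on": True,  "door_open": False},
]
assert test_oracle(trace)
```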
  22. Prof. Dr. Stefan Wagner, e-mail: [email protected], phone: +49 (0) 711 685-88455, WWW: www.iste.uni-stuttgart.de/se, Twitter: prof_wagnerst, ORCID: 0000-0002-5256-8429, Institute of Software Technology.
  23. Pictures used in this slide deck:
    Safety by GotCredit (https://flic.kr/p/qHCmfo, Got Credit)
    Unsafe Area by Jerome Vial under CC BY-SA 2.0 (https://flic.kr/p/71Kpk7)
    Airplane by StockSnap (https://pixabay.com/de/flugzeug-reisen-transport-airasia-926744/)
    Swiss Cheese Model by Davidmack, own work, CC BY-SA 3.0 (https://commons.wikimedia.org/w/index.php?curid=31679759)
    The Titanic in the port of Southampton, public domain (https://commons.wikimedia.org/w/index.php?curid=19027661)
    Pisa by Aaron Kreis (https://flic.kr/p/wzEw5K)
    Looking back by Susanne Nilsson (https://flic.kr/p/niBFZo)
    Concorde Cockpit by Dr. Richard Murray (https://commons.wikimedia.org/wiki/File:Concorde_Cockpit_-_geograph.org.uk_-_1357498.jpg)