Regulatory Compliance Software Engineering

akmassey
March 27, 2013


Laws and regulations safeguard citizens’ security and privacy. For example, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) governs the security and privacy of electronic health records (EHR) systems. Non-compliance with HIPAA can result in millions of dollars in penalties. Ensuring that EHR systems are legally compliant is challenging for software engineers because the laws and regulations governing these systems are written by policymakers with little to no understanding of software engineering. This presentation introduces the field of Regulatory Compliance Software Engineering and discusses a particular research concern within that field: how can we help software engineers assess whether security and privacy requirements for EHR systems are legally compliant?


Transcript

  1. 1.

    Regulatory Compliance Software Engineering Aaron Massey March 26, 2013 School

    of Interactive Computing, Georgia Institute of Technology akmassey@gatech.edu http://www.cc.gatech.edu/~akmassey @akmassey 1
  2. 2.

    © 2006-2013 Aaron Massey et al., Georgia Institute of Technology

    Overview 2 § Ethical Background and Motivation § Defining Regulatory Compliance Software Engineering § Legal Implementation Readiness Metrics § Questions
  3. 4.


    A Historical Perspective “All that may come to my knowledge in the exercise of my profession or in daily commerce with men, which ought not to be spread abroad, I will keep secret and will never reveal.” –The Hippocratic Oath § Both the Hippocratic Oath (500 BC) and the Code of Hammurabi (1772 BC) contain ethical imperatives for engineers. § Society uses laws and regulations to codify ethical standards, including Security and Privacy! 4
  4. 5.


    Example Legal Domain: Health Care § Health Insurance Portability and Accountability Act (HIPAA) passed in 1996 – Regulates security and privacy for healthcare organizations – Applies to both electronic and paper-based systems – $25,000 fines per violation per year for non-criminal violations § Health Information Technology for Economic and Clinical Health (HITECH) Act passed in 2009: – Updated civil and criminal penalties – New rules for disclosures of PHI – Data breach notification § A Cignet Health violation resulted in a $4.3 million penalty. 5
  5. 6.


    In Layman’s Terms… § Peyton Manning § 4-time NFL MVP § Had neck surgery Spring 2011 § Hounded by reporters about his recovery. “I don't know what HIPAA stands for, but I believe in it and I practice it.” Reference: http://espn.go.com/blog/afcsouth/post/_/id/27143/mannings-stance-on-hipaa-for-it 6
  6. 7.


    Regulatory Compliance Software Engineering (RCSE) is the application of a systematic approach to building, maintaining, and verifying software that must comply with laws and regulations. 7
  7. 8.


    Regulatory Compliance Software Engineering (RCSE) Five Stages of Software Engineering 1. Requirements 2. Design 3. Implementation 4. Test 5. Maintenance 8 RCSE Practices ▪ Legal Requirements Elicitation or Identification [BDM06, MGL06, MPZ05, MA09a, MA09b, MEA13] ▪ Legal Requirements Triage and Prioritization [Bre09, CCG10, MOA09, MA10] ▪ Requirements Traceability [CCG10, MOH10] ▪ Legal Implementation Readiness Decisions [MA09, MOH10, MSO11, Mas12]
  8. 9.

    [Figure: a requirements document hierarchy (Medical Professional, Surgeon, Pediatrician, Surgical Resident) traced to a legal text hierarchy through direct and indirect tracing] 9 Terminology Mapping [MOH10]
  9. 10.

    [Figure: flowchart of the traceability methodology. Start by determining the focus of the legal text (actor-focused or data-focused), then perform Terminology Mapping (map actors, data objects, and actions, in an order that depends on that focus), Requirements Identification and Disambiguation (apply the Inquiry Cycle Model; classify ambiguity and relationships), Requirements Elaboration (document priority and origin; record document provenance), and Tracing Requirements to Legal Texts (legal subsection mapping; document remaining concerns), then stop.] 10 Traceability Methodology [MOH10]
  10. 11.


    Legal Implementation Readiness A requirement is Legally Implementation Ready (LIR) when it meets or exceeds its obligations under the law. 11
  11. 12.


    Example LIR Requirement Consider Requirement A: iTrust shall generate a unique user ID and default password upon account creation by a system administrator. Traces to § 164.312(a)(1) and § 164.312(a)(2)(i) 12 Relevant HIPAA Sections: (a)(1) Standard: Access control. Implement technical policies and procedures for electronic information systems that maintain electronic protected health information to allow access only to those persons or software programs that have been granted access rights as specified in § 164.308(a)(4). (2) Implementation specifications: (i) Unique user identification (Required). Assign a unique name and/or number for identifying and tracking user identity.
  12. 13.


    Example Non-LIR Requirement Consider Requirement B: iTrust shall allow an authenticated user to change their user ID and password. Traces to §164.312(a)(1) and §164.312(a)(2)(i) 13 Relevant HIPAA Sections: (a)(1) Standard: Access control. Implement technical policies and procedures for electronic information systems that maintain electronic protected health information to allow access only to those persons or software programs that have been granted access rights as specified in § 164.308(a)(4). (2) Implementation specifications: (i) Unique user identification (Required). Assign a unique name and/or number for identifying and tracking user identity.
  13. 14.


    LIR Assessment Case Study Design § Participants – 32 graduate student participants over multiple study sessions – 3 subject matter experts – 8 legal requirements metrics § Input Materials – 31 iTrust requirements to analyze – Traceability Matrix – Text of HIPAA §164.312 • Familiarity [BA08, MA10a, MA10b, MOH10, MOA09, MA09a, MA09b] • Focuses on technical measures of protection • Complete, self-contained section of the legal text 14 [MSO11]
  14. 15.


    Assessment Study #1: Research Questions § Is there consensus among: – [Q1] subject matter experts about which requirements are LIR? – [Q2] graduate students about which requirements are LIR? § [Q3] Can graduate students accurately assess which requirements are LIR? § [Q4] Can we predict which requirements are LIR using attributes of those requirements? § [Q5] Are the metric categories we have established measures of whether a requirement is LIR? 15 [MSO11]
  15. 16.


    Results: Consensus among Subject Matter Experts § Q1: Is there consensus among subject matter experts on which requirements are LIR? § Result: Moderate agreement among the experts about the requirements prior to the discussion session. – κ = 0.517 (p < 0.0001) – Universal agreement on 19 of the 31 requirements 16 [MSO11]
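The κ values reported in this study are agreement statistics for multiple raters. A minimal sketch of Fleiss' kappa, which generalizes Cohen's kappa to more than two raters, is shown below; the votes are invented for illustration and are not the study's actual data.

```python
# Fleiss' kappa: chance-corrected agreement among n raters over N subjects.
# The ratings below are illustrative, not the study's data.
from collections import Counter

def fleiss_kappa(ratings):
    """ratings: list of per-subject lists of category labels (one per rater)."""
    n = len(ratings[0])   # raters per subject
    N = len(ratings)      # number of subjects
    categories = {c for row in ratings for c in row}
    counts = [Counter(row) for row in ratings]  # n_ij per subject
    # Observed agreement per subject, then averaged
    P_i = [(sum(c[j] ** 2 for j in categories) - n) / (n * (n - 1)) for c in counts]
    P_bar = sum(P_i) / N
    # Expected agreement from overall category proportions
    p_j = {j: sum(c[j] for c in counts) / (N * n) for j in categories}
    P_e = sum(p ** 2 for p in p_j.values())
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical: 3 raters judge 4 requirements as LIR / not LIR
votes = [["LIR", "LIR", "LIR"],
         ["LIR", "LIR", "not"],
         ["not", "not", "not"],
         ["LIR", "not", "not"]]
print(round(fleiss_kappa(votes), 3))
```

On the benchmark scale commonly used with kappa, values near 0.5 indicate moderate agreement, which matches the interpretation on this slide.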
  16. 17.


    Results: Consensus among participants § Q2: Is there consensus among participants on which requirements are LIR? § Result: Slight agreement about the requirements. – κ = 0.0792 (p < 0.0001) – Only somewhat better than “agreement” found in perfectly random responses. 17 [MSO11]
  17. 18.


    Results: Assessment of LIR § Q3: Can graduate students accurately assess which requirements are LIR? § Used 50% as the cutoff for voting on the status of requirements § Result: Students cannot accurately assess the LIR status of a requirement and are more likely to miss requirements that are not LIR. – Sensitivity = 0.875 – Specificity = 0.2 – Agreement 54.8% 18 [MSO11]
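The sensitivity, specificity, and agreement figures on this slide follow from a confusion matrix over the votes. A small sketch, treating "LIR" as the positive class; the student and expert labels here are made up for illustration.

```python
# Confusion-matrix accuracy measures for binary LIR assessments.
# Labels are invented; "LIR" is treated as the positive class.
def assess(predicted, actual, positive="LIR"):
    tp = sum(p == a == positive for p, a in zip(predicted, actual))
    tn = sum(p == a != positive for p, a in zip(predicted, actual))
    fp = sum(p == positive != a for p, a in zip(predicted, actual))
    fn = sum(a == positive != p for p, a in zip(predicted, actual))
    return {"sensitivity": tp / (tp + fn),   # LIR requirements correctly found
            "specificity": tn / (tn + fp),   # non-LIR requirements correctly found
            "agreement": (tp + tn) / len(actual)}

# Hypothetical: student majority votes vs. expert ground truth
pred = ["LIR", "LIR", "LIR", "not", "LIR"]
truth = ["LIR", "not", "LIR", "not", "not"]
print(assess(pred, truth))
```

High sensitivity with low specificity, as in the study, means the raters said "LIR" too readily: they caught most truly-LIR requirements but mislabeled most non-LIR ones.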
  18. 19.


    Legal Texts are Hierarchical a. Lorem ipsum dolor sit amet. (1) Except when: ed do eiusmod. (2) Incididunt ut labore et dolore. i. Magna aliqua. ii. Ut enim ad: Section (f)(2). (3) Quis nostrud exercitation. i. Fugiat nulla pariatur. ii. Consectetur adipisicing. (4) Ullamco laboris nisi ut aliquip. b. Ex ea commodo consequat. c. Duis aute irure dolor. Even if we know nothing about the meaning of the law, we can still extract some meaning from the structure. 19 [MA10]
  19. 20.


    Legal Requirements Metrics § Dependency Metrics estimate the extent to which one requirement is dependent on another requirement. § Complexity Metrics estimate the amount of work required to implement a requirement. § Maturity Metrics estimate the extent to which a requirement can be refined. 20 [MA10]
  20. 21.


    Dependency Metrics
    § SM (Subsections Mapped): The number of subsections to which a requirement maps.
    § C (Cross-References): The number of cross-references found within subsections to which a requirement maps.
    21 [MA10]
  21. 22.


    Example Dependency Metric: Subsections Mapped 22 a. Lorem ipsum dolor sit amet. (1) Except when: ed do eiusmod. (2) Incididunt ut labore et dolore. i. Magna aliqua. ii. Ut enim ad: Section (f)(2). (3) Quis nostrud exercitation. i. Fugiat nulla pariatur. ii. Consectetur adipisicing. (4) Ullamco laboris nisi ut aliquip. b. Ex ea commodo consequat. c. Duis aute irure dolor. Name: Sample Requirement Description: Occaecat cupidatat non. Legal Subsections: (a)(1) and (a)(2)(ii) Subsections Mapped = 2 [MA10]
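Both dependency metrics reduce to simple counting over the trace mapping: SM is the size of the mapping, and C scans the mapped text for citations. The legal text and the cross-reference regex below are invented for illustration.

```python
# Dependency metrics for one requirement: SM and C.
# The mapped subsections mirror the slide's example; the text is invented.
import re

legal_text = {
    "(a)(1)": "Except when: Section (f)(2) applies, do X.",
    "(a)(2)(ii)": "Ut enim ad: Section (f)(2).",
}
mapped = ["(a)(1)", "(a)(2)(ii)"]  # subsections the requirement traces to

subsections_mapped = len(mapped)  # SM = 2, as on the slide
# C: count citations of the form "Section (f)(2)" in the mapped text
cross_refs = sum(len(re.findall(r"Section \([a-z]\)\(\d+\)", legal_text[s]))
                 for s in mapped)
print(subsections_mapped, cross_refs)
```

A real implementation would need a citation grammar matched to the legal corpus; the regex here only handles the toy label format.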
  22. 23.


    Complexity Metrics
    § NW (Number of Words): The number of words found within subsections to which a requirement maps.
    § NS (Number of Sentences): The number of sentences found within subsections to which a requirement maps.
    § SC (Subsection Count): The number of subsections recursively counted within the subsections to which a requirement maps.
    § E (Exceptions): The number of exceptions within subsections to which a requirement maps.
    23 [MA10]
  23. 24.


    Example Complexity Metric: Subsection Count – SC 24 a. Lorem ipsum dolor sit amet. (1) Except when: ed do eiusmod. (2) Incididunt ut labore et dolore. i. Magna aliqua. ii. Ut enim ad: Section (f)(2). (3) Quis nostrud exercitation. i. Fugiat nulla pariatur. ii. Consectetur adipisicing. (4) Ullamco laboris nisi ut aliquip. b. Ex ea commodo consequat. c. Duis aute irure dolor. Name: Sample Requirement Description: Occaecat cupidatat non. Legal Subsections: (a)(1) and (a)(2)(ii) Subsection Count = 2 [MA10]
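A rough sketch of three of the complexity metrics over the text mapped to one requirement. The mapped text is invented, and the sentence and exception detection is deliberately naive; SC would instead count subsections recursively using the document structure rather than the raw text.

```python
# Complexity metrics NW, NS, and E for one requirement's mapped text.
# The text is invented; detection heuristics are intentionally simple.
import re

mapped_text = ("Lorem ipsum dolor sit amet. Except when: ed do eiusmod. "
               "Incididunt ut labore et dolore.")

num_words = len(mapped_text.split())                            # NW
num_sentences = len(re.findall(r"[^.?!]+[.?!]", mapped_text))   # NS
exceptions = len(re.findall(r"\bexcept\b", mapped_text, re.IGNORECASE))  # E
print(num_words, num_sentences, exceptions)
```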
  24. 25.


    Maturity Metrics
    § SD (Subsection Depth): The deepest-level subsection to which a requirement maps (SCD) minus SM.
    § SF (Subsection Fulfillment Percentage): The percentage of mapped subsections in the highest-level sections to which a requirement maps.
    25 [MA10]
  25. 26.


    Example Maturity Metric: Subsection Depth 26 a. Lorem ipsum dolor sit amet. (1) Except when: ed do eiusmod. (2) Incididunt ut labore et dolore. i. Magna aliqua. ii. Ut enim ad: Section (f)(2). (3) Quis nostrud exercitation. i. Fugiat nulla pariatur. ii. Consectetur adipisicing. (4) Ullamco laboris nisi ut aliquip. b. Ex ea commodo consequat. c. Duis aute irure dolor. Name: Sample Requirement Description: Occaecat cupidatat non. Legal Subsections: (a)(1) and (a)(2)(ii) Subsection Depth = 3 - 2 = 1 [MA10]
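The SD arithmetic from this example can be sketched as follows; the `level` helper, which infers nesting depth from the subsection label format, is a hypothetical convenience, not part of the published method.

```python
# Maturity metric SD = depth of the deepest mapped subsection minus SM.
# Labels follow the slide's running example: (a)(1) and (a)(2)(ii).
mapped = ["(a)(1)", "(a)(2)(ii)"]

def level(label):
    """Nesting level = number of parenthesized components, e.g. (a)(2)(ii) -> 3."""
    return label.count("(")

subsection_depth = max(level(s) for s in mapped) - len(mapped)  # 3 - 2 = 1
print(subsection_depth)
```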
  26. 27.


    Statistical Models § Created a logistic regression model for each data set against the consensus SME responses § Each logistic regression model used 10-fold cross validation: – Partition data into 10 sets – Use 9 sets for training, and 1 for prediction – Repeat 10 times – Average the results for the final prediction model 27 [MSO11]
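The 10-fold procedure described above can be sketched as follows. A majority-class stub stands in for the logistic regression model actually used in the study, and the data is invented, so this illustrates only the partition-train-predict loop.

```python
# 10-fold cross-validation loop: partition the data into k folds, train on
# k-1 folds, predict the held-out fold, repeat for every fold.
# The "model" here is a stub (predict the training majority class).
import random

def k_fold_out_of_sample_predictions(X, y, k=10, seed=0):
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint folds
    preds = [None] * len(X)
    for i, fold in enumerate(folds):
        train = [j for f, other in enumerate(folds) if f != i for j in other]
        # Stub model: predict the training set's majority class.
        majority = max(set(y[j] for j in train),
                       key=lambda c: sum(y[j] == c for j in train))
        for j in fold:
            preds[j] = majority
    return preds

X = list(range(10))                 # invented feature placeholders
y = ["LIR"] * 7 + ["not"] * 3       # invented labels
print(k_fold_out_of_sample_predictions(X, y))
```

The out-of-fold predictions can then be compared against the consensus SME labels to compute sensitivity, specificity, and kappa, as the next slide reports for the real model.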
  27. 28.


    Results: Using attributes to predict LIR § Q4: Can we predict which requirements are LIR using attributes of those requirements? § Result: The logistic regression model built on our legal requirements metrics exhibited fair agreement with the expert opinion. – Sensitivity = 0.625, Specificity = 0.80, and κ = 0.35 (p < 0.0001) § Model is more likely to miss LIR requirements than non-LIR requirements. The metrics can be useful. 28 [MSO11]
  28. 29.


    Results: Legal Requirements Metrics Categories § Q5: How do the categories for the legal requirements metrics affect whether a given requirement is LIR? § If the coefficients of the logistic regression function are negative, then higher values mean the requirement is less likely to be LIR. § If the coefficients are positive, then higher values mean the requirement is more likely to be LIR. § Coefficient signs: Dependency negative, Complexity negative, Maturity positive. 29 [MSO11]
  29. 30.


    Wideband Delphi Case Study Research Questions § Study Goal: Further validate our previous results § [Q7] Can graduate students working together using a Wideband Delphi method accurately assess which requirements are LIR? – Q7 mirrors Q3 from the previous study; the difference is that this study uses the Wideband Delphi method to reach consensus, whereas the previous study used 50% voting. § [Q8] What is the extent of the discussion on requirements during the application of the Wideband Delphi method? 30
  30. 31.


    Wideband Delphi Case Study Design § 14 graduate student participants – All participants made an initial determination for each of the 31 requirements. – For each requirement, participants either: • Achieved unanimous consensus that the requirement was LIR • Achieved unanimous consensus that the requirement was not LIR • Were unable to achieve unanimous consensus, which everyone agreed meant the requirement should be considered not LIR § Two Consensus Sessions – 13 participants discussed 26 requirements in the first session – 12 participants discussed the remaining 5 requirements in the second session • 11 participants from the first session; 1 participant who missed the first session 31
  31. 32.


    Results: Wideband Delphi Assessment of LIR § Q7: Can graduate students working together using a Wideband Delphi method accurately assess which requirements are LIR? § Result: Students cannot accurately assess the LIR status of a requirement and are more likely to miss requirements that are LIR. – Sensitivity = 0.313, Specificity = 0.8, and Agreement 54.8% § The participants were much more conservative when working together to achieve consensus than they were individually. 32
  32. 33.


    Results: Consensus among Wideband Delphi Participants § Q8: What is the extent of the discussion on requirements during the application of the Wideband Delphi method? § Result: Fair agreement among the participants about the requirements prior to the discussion session. – κ = 0.252 (p < 0.0001) – Recall: Experts from LIR Assessment Case Study started at κ = 0.517 (p < 0.0001) § Unable to achieve consensus on 6 of the 31 requirements after discussion 33
  33. 34.


    RCSE Contribution Summary § Presented, briefly, an empirically evaluated methodology for tracing software requirements to legal texts. § Demonstrated that graduate students trained in software engineering and requirements engineering are ill-equipped to make Legal Implementation Readiness decisions. § Explained a set of eight empirically evaluated legal requirements metrics that can be used to estimate the dependency, complexity, and maturity of software requirements that must comply with laws and regulations. 34
  34. 35.


    Future Work § Examine traceability from other software artifacts to relevant subsections of legal texts. § Examine automated techniques for generating traceability links. – If successful, only a set of requirements and a relevant legal text would be needed to perform legal requirements triage using legal requirements metrics. § Employ natural language processing methodologies to create even more accurate legal requirements metrics. § Conduct empirical studies of software engineering practitioners and legal domain experts to further validate the findings presented here. 35
  35. 36.


    Big Picture Takeaways § RCSE is a young, interdisciplinary field with lots of exciting research opportunities in security and privacy. § Software engineering graduate students are ill-prepared to make legal implementation readiness decisions with any confidence. § Subject matter experts must be involved in legal compliance decisions. 36
  36. 37.

    Thank you! Questions? Aaron Massey March 26, 2013 School of

    Interactive Computing, Georgia Institute of Technology akmassey@gatech.edu http://www.cc.gatech.edu/~akmassey @akmassey 37