Slide 1

Slide 1 text

When Lawyers Talk With Engineers: Avoiding the Lost In Translation Problem

Aaron Massey
Assistant Professor of Software Engineering
Department of Information Systems
University of Maryland, Baltimore County

Co-Presenters: Peter Swire and Justin Hemmings

Slide 2

Slide 2 text

Challenges for Engineers

1. Engineers want to move fast and break things.
   – Building an interesting or useful technology is harder than hiring a lawyer.
   – Law is a problem for lawyers, not engineers.
2. Lawyers unnecessarily complicate simple things.
   – Many “basic” technologies seem to violate one law or another.
3. Law is like Code, Specifications, or Standards.
   – Every regulation has a single, correct, absolute meaning.
   – Ambiguities found in regulatory text are accidental.
   – Why won’t the lawyer just tell us what the law says?

Slide 3

Slide 3 text

Good News #1: Legal Obligations are an Engineering Concern

“Violation of a law or regulation may be ethical when that law or rule has inadequate moral basis or when it conflicts with another law judged to be more important.” – ACM Code of Ethics

§ The ACM, IEEE, and the National Society of Professional Engineers each have written Codes of Ethics to which members must agree.
§ Engineers need tools to help them understand their legal and ethical obligations.
§ Blind compliance with the law may be unethical!

Slide 4

Slide 4 text

Good News #2: Engineers know they must involve all Stakeholders

§ Engineers are trained to involve all relevant stakeholders in requirements, design, and testing.
§ Common Stakeholders include:
  – The Development team itself
    • Programmers
    • Designers
    • Testers
  – Customers
  – Users (sometimes not the same as the Customers)
  – Regulators
§ Elicitation techniques capture requirements in an implementable, testable fashion from non-technical users, customers, and other stakeholders.

Slide 5

Slide 5 text

Good News #3: Engineers seek to fully understand the problem

“We aim to make simple things simple and complex things possible.” – Alan Kay

§ Miscommunication is the number one reason software projects fail.
  – Failure to understand stakeholder needs
  – Failure to communicate stakeholder needs
§ Engineers know they must fully understand the problem and communicate their understanding to others.
§ Techniques: elicitation, extraction, requirements disambiguation, modeling, and analysis

Slide 6

Slide 6 text

Regulatory Compliance in Software Engineering (RCSE)

§ RCSE is the application of a systematic approach to building, maintaining, and verifying software that must comply with laws and regulations.
§ An active area for research!
  – Identification: Semantic Parameterization (Breaux)
  – Specification Languages: GRL (Ghanavati), Eddy (Breaux)
  – Traceability: via Machine Learning (Cleland-Huang)
  – Threat Modeling: LINDEN (TODO)
  – De-identification: Differential Privacy (Dwork et al.)
  – Access Control: many projects, many researchers
  – Risk Assessment: many projects, many researchers
  – …and many more.

Slide 7

Slide 7 text

Legal Implementation Readiness Study [IEEE Int’l Conf. on Requirements Engineering 2011]

§ Requirements that meet or exceed their legal obligations are Legally Implementation Ready (LIR).
§ LIR requirements can be estimated based on the structure of the legal text.

Slide 8

Slide 8 text

LIR Research Summary

§ Big Idea: We can determine whether a system is grossly non-compliant by:
  1. Updating requirements terms to reflect legal terms
  2. Clarifying requirements using the updated terms
  3. Tracing requirements to subsections of laws and regulations
  4. Evaluating metrics that measure “rough” compliance:
     • Dependency
     • Complexity
     • Maturity
§ Results: Conceptually, it works! But it is not ready for industry use: there are too many false positives.
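
The tracing step (step 3) can be pictured as a table of links from requirements to legal subsections. The sketch below is an illustration only: the requirement ids, the in-scope subsection set, and the two coverage checks are assumptions made for this example, and they are not the Dependency, Complexity, or Maturity metrics from the study.

```python
# Hypothetical trace links: requirement id -> legal subsections it traces to.
traces = {
    "REQ-1": ["164.502(b)"],
    "REQ-2": ["164.502(b)", "164.312(a)"],
    "REQ-3": [],
}

# Hypothetical set of subsections the system is expected to address.
in_scope = {"164.502(b)", "164.312(a)", "164.312(b)"}


def untraced_requirements(traces):
    """Requirements with no link to any legal subsection."""
    return sorted(req for req, subs in traces.items() if not subs)


def uncovered_subsections(traces, in_scope):
    """In-scope subsections that no requirement traces to."""
    covered = {s for subs in traces.values() for s in subs}
    return sorted(in_scope - covered)


print(untraced_requirements(traces))
print(uncovered_subsections(traces, in_scope))
```

Either list being non-empty would only be a prompt for closer review, not a compliance finding.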

Slide 9

Slide 9 text

Legal Ambiguity: a Critical Challenge for Requirements [IEEE Int’l Conf. on Requirements Engineering 2014]

§ Legal texts are often intentionally ambiguous.
  – Example: “make reasonable efforts to limit protected health information to the minimum necessary to accomplish the intended purpose of the use” – HIPAA §164.502(b)
  – The word “reasonable” appears 61 times in HIPAA!
§ Traditional approaches, such as disambiguation or removal, do not work for legal ambiguities.
  – Legal texts cannot easily be re-written.
  – Legal stakeholders cannot easily be sought out for definitive clarification.

Requirements engineers must interpret ambiguities in legal texts!
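
A count like the one for “reasonable” can be approximated with a whole-word search over the regulation's plain text. This is a minimal sketch: the function name `count_term` and the sample passage are made up for illustration, and whether the slide's figure of 61 also counts derived forms such as “reasonably” is not stated, so this version counts exact matches only.

```python
import re


def count_term(text: str, term: str) -> int:
    """Count case-insensitive whole-word occurrences of `term` in `text`."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


# Illustrative passage, not the regulation itself.
sample = (
    "A covered entity must make reasonable efforts to limit protected "
    "health information to the minimum necessary. Reasonable safeguards "
    "apply; reasonably anticipated uses are not counted by this search."
)

print(count_term(sample, "reasonable"))
```

Running the same function over the full regulatory text would give a comparable count, subject to how the text was extracted.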

Slide 10

Slide 10 text

Ambiguities identified in § 170.302 [IEEE Int’l Conf. on Requirements Engineering 2014]

[Chart: Ambiguities per Paragraph, on a scale of 0 to 60, for paragraphs 170.302(a) through 170.302(w). Ambiguity types shown: Lexical, Syntactic, Semantic, Vagueness, Incompleteness, Referential, Other.]

Slide 11

Slide 11 text

Legal Ambiguity Summary [IEEE Int’l Conf. on Requirements Engineering 2014]

§ Given 50 minutes and 104 lines of legal text, participants identified 33.47 ambiguities on average.
§ 97.5% of all ambiguities identified were classified as one of the six defined types.

Participants found paragraphs with unintentional ambiguity to be implementable!

Slide 12

Slide 12 text

Idealized Policy Documents

[Diagram labels: Write, Read, Regulate; Average Consumers, Regulators]

Slide 13

Slide 13 text

Real Policy Documents

[Diagram labels: Write, Read, Regulate; Average Consumers, Regulators. Callout: Too Complicated!]

Slide 14

Slide 14 text

Real Policy Documents

[Diagram labels: Write, Read, Regulate; Average Consumers, Regulators. Callouts: Too Complicated! Too Many Policies!]

Slide 15

Slide 15 text

© 2006-2014 Aaron Massey et al., Georgia Institute of Technology

Privacy Policies Contain Software Requirements [AE04, AEV07]

§ Privacy Policies contain both privacy protection goals and possible privacy vulnerabilities.
§ Goals and Vulnerabilities can be expressed in a semi-formal structure using keywords.
§ Some Examples:
  – COLLECT date and times at which site was accessed
  – STORE credit card information until dispute is resolved
  – ALLOW affiliates to use information for marketing purposes
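
The keyword-led structure of these examples lends itself to a simple split into a keyword and a description. The sketch below is an assumption about how such statements might be represented in code, not the notation defined in [AE04, AEV07]; `GoalStatement`, `parse_goal`, and the three-keyword set are hypothetical names chosen for this example.

```python
from typing import NamedTuple, Optional

# Keywords taken from the slide's examples; a real keyword set would be larger.
GOAL_KEYWORDS = {"COLLECT", "STORE", "ALLOW"}


class GoalStatement(NamedTuple):
    keyword: str
    description: str


def parse_goal(line: str) -> Optional[GoalStatement]:
    """Split 'KEYWORD description' into parts; None if no known keyword leads."""
    parts = line.strip().split(maxsplit=1)
    if len(parts) != 2 or parts[0] not in GOAL_KEYWORDS:
        return None
    return GoalStatement(parts[0], parts[1])


statements = [
    "COLLECT date and times at which site was accessed",
    "STORE credit card information until dispute is resolved",
    "ALLOW affiliates to use information for marketing purposes",
]

for statement in statements:
    print(parse_goal(statement))
```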

Slide 16

Slide 16 text

Using the Topic Model [IEEE Int’l Conf. on Requirements Engineering 2013]

§ Select a Goal Keyword
§ Select the topic in which the keyword is most likely present
§ Select documents in which that topic is most likely present

[Figure: three columns linked by braces.
  Goal Keywords: ALLOW, COLLECT, CUSTOMIZE, DISCLOSE, INFORM
  Topics with Term: 20, 68, 150, 9, 125
  Documents with Topic: YouTube Terms of Service, Microsoft Privacy Statement, ConocoPhillips Legal and Privacy Statement]
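
The three selection steps can be sketched over the output of any topic model. The example below assumes the model has already been trained and exposes two tables: `topic_word` (probability of a word within each topic) and `doc_topic` (weight of each topic within each document). All names, ids, and numbers are invented for illustration and do not come from the study.

```python
def best_topic(topic_word, keyword):
    """Return the topic id in which `keyword` has the highest probability."""
    return max(topic_word, key=lambda t: topic_word[t].get(keyword, 0.0))


def top_documents(doc_topic, topic, n=2):
    """Return the `n` documents with the highest weight for `topic`."""
    ranked = sorted(
        doc_topic,
        key=lambda d: doc_topic[d].get(topic, 0.0),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical model output: P(word | topic) and P(topic | document).
topic_word = {
    0: {"collect": 0.08, "disclose": 0.01},
    1: {"collect": 0.02, "disclose": 0.09},
}
doc_topic = {
    "policy_a": {0: 0.7, 1: 0.3},
    "policy_b": {0: 0.2, 1: 0.8},
    "policy_c": {0: 0.5, 1: 0.5},
}

topic = best_topic(topic_word, "collect")      # steps 1 and 2
documents = top_documents(doc_topic, topic)    # step 3
print(topic, documents)
```

The selected documents are then the ones to read first when looking for statements that use that goal keyword.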

Slide 17

Slide 17 text

Finding Requirements in Policy Documents [IEEE Int’l Conf. on Requirements Engineering 2013]

Slide 18

Slide 18 text

Research Contribution Summary

§ Demonstrated that entry-level software engineers are ill-prepared to make LIR decisions
§ Demonstrated that LIR metrics can provide useful guidance for these decisions
§ Demonstrated that ambiguities are prevalent in legal texts and that they have implications for software engineers
§ Demonstrated that topic modeling can help to identify software goals and vulnerabilities in policy documents at scale

Slide 19

Slide 19 text

Thank you! Questions?

Aaron Massey
Assistant Professor of Software Engineering
Department of Information Systems
University of Maryland, Baltimore County
http://userpages.umbc.edu/~akmassey/
[email protected]
@akmassey

Co-Presenters: Peter Swire and Justin Hemmings