Privacy + Security Forum: When Lawyers Talk With Engineers
These are the slides from my introductory remarks on a panel at the 2015 Privacy + Security Forum on avoiding translation problems between lawyers and engineers.
Aaron Massey
Assistant Professor of Software Engineering
Department of Information Systems, University of Maryland, Baltimore County
Co-Presenters: Peter Swire and Justin Hemmings

Problem
…break things.
– Building an interesting or useful technology is harder than hiring a lawyer.
– Law is a problem for lawyers, not engineers.
2. Lawyers unnecessarily complicate simple things.
– Many “basic” technologies seem to violate one law or another.
3. Law is like Code, Specifications, or Standards.
– Every regulation has a single, correct, absolute meaning.
– Ambiguities found in regulatory text are accidental.
– Why won’t the lawyer just tell us what the law says?
“Violation of a law or regulation may be ethical when that law or rule has inadequate moral basis or when it conflicts with another law judged to be more important.” – ACM Code of Ethics

§ The ACM, the IEEE, and the National Society of Professional Engineers have each written Codes of Ethics to which members must agree.
§ Engineers need tools to help them understand their legal and ethical obligations.
§ Blind compliance with the law may be unethical!
§ Engineers are trained to involve all relevant stakeholders in requirements, design, and testing.
§ Common stakeholders include:
– The development team itself
• Programmers
• Designers
• Testers
– Customers
– Users (sometimes not the same as the customers)
– Regulators
§ Elicitation techniques capture requirements in an implementable, testable fashion from non-technical users, customers, and other stakeholders.
“We aim to make simple things simple and complex things possible.” – Alan Kay

§ Miscommunication is the number one reason software projects fail.
– Failure to understand stakeholder needs
– Failure to communicate stakeholder needs
§ Engineers know they must fully understand the problem and communicate their understanding to others.
§ Techniques: elicitation, extraction, requirements disambiguation, modeling, and analysis
…application of a systematic approach to building, maintaining, and verifying software that must comply with laws and regulations.

§ An active area of research!
– Identification: Semantic Parameterization (Breaux)
– Specification languages: GRL (Ghanavati), Eddy (Breaux)
– Traceability: via machine learning (Cleland-Huang)
– Threat modeling: LINDDUN (Deng et al.)
– De-identification: Differential Privacy (Dwork et al.)
– Access control: many projects, many researchers
– Risk assessment: many projects, many researchers
– …and many more.
…2011]

§ Requirements that meet or exceed their legal obligations are Legally Implementation Ready (LIR).
§ Whether requirements are LIR can be estimated from the structure of the legal text.
…a system is grossly non-compliant by:
1. Updating requirements terms to reflect legal terms
2. Clarifying requirements using the updated terms
3. Tracing requirements to subsections of laws and regulations
4. Evaluating metrics that measure “rough” compliance:
• Dependency
• Complexity
• Maturity

§ Results: conceptually, it works! But it is not yet ready for industry use: too many false positives.
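The tracing and metric steps can be sketched in a few lines of Python. This is a hypothetical illustration only: the requirement names, trace links, and the simple “dependency” definition (count of traced legal subsections) are assumptions, not the actual metrics from the study.

```python
# Hypothetical trace links from requirements to legal subsections (step 3).
# Requirement names and HIPAA subsection numbers are illustrative.
trace_links = {
    "REQ-1 encrypt PHI at rest": ["164.312(a)(2)(iv)"],
    "REQ-2 log all PHI access": ["164.312(b)", "164.308(a)(1)(ii)(D)"],
    "REQ-3 display marketing banner": [],  # no legal obligation traced
}

def dependency(req: str) -> int:
    """A rough 'dependency' signal: how many legal subsections a requirement touches."""
    return len(trace_links[req])

def untraced(links: dict) -> list:
    """Requirements with no legal trace: candidates for review, or simply out of scope."""
    return [req for req, subs in links.items() if not subs]

for req in trace_links:
    print(req, "->", dependency(req))
print("untraced:", untraced(trace_links))
```

A high dependency count flags a requirement worth extra legal review; an empty trace flags a requirement that may have been written without considering the law at all.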
…on Requirements Engineering 2014]

§ Legal texts are often intentionally ambiguous.
– Example: “make reasonable efforts to limit protected health information to the minimum necessary to accomplish the intended purpose of the use” – HIPAA §164.502(b)
– The word “reasonable” appears 61 times in HIPAA!
§ Traditional approaches, such as disambiguation or removal, do not work for legal ambiguities.
– Legal texts cannot easily be re-written.
– Legal stakeholders cannot easily be sought out for definitive clarification.
§ Requirements engineers must interpret ambiguities in legal texts!
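A first pass at surfacing these ambiguities can be automated. The sketch below flags hedge words in legal text; the marker list is illustrative and is not the six-type ambiguity taxonomy from the study.

```python
import re

# Illustrative hedge words that often signal intentional legal ambiguity.
# This list is an assumption, not the study's classification scheme.
AMBIGUITY_MARKERS = {"reasonable", "appropriate", "adequate", "necessary", "promptly"}

def flag_ambiguities(legal_text: str):
    """Return (word, character offset) pairs for terms an engineer must interpret."""
    hits = []
    for m in re.finditer(r"[a-z]+", legal_text.lower()):
        if m.group() in AMBIGUITY_MARKERS:
            hits.append((m.group(), m.start()))
    return hits

hipaa = ("make reasonable efforts to limit protected health information "
         "to the minimum necessary to accomplish the intended purpose of the use")
print(flag_ambiguities(hipaa))
```

On the HIPAA excerpt above, this flags “reasonable” and “necessary”: exactly the terms whose interpretation the regulation delegates to the implementer.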
§ Given 50 minutes and 104 lines of legal text, participants identified 33.47 ambiguities on average.
§ 97.5% of all ambiguities identified were classified as one of the six defined types.
§ Participants judged paragraphs with unintentional ambiguity to be implementable!
Privacy Policies Contain Software Requirements [AE04, AEV07]

§ Privacy policies contain both privacy protection goals and possible privacy vulnerabilities.
§ Goals and vulnerabilities can be expressed in a semi-formal structure using keywords.
§ Some examples:
– COLLECT date and times at which site was accessed
– STORE credit card information until dispute is resolved
– ALLOW affiliates to use information for marketing purposes
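The semi-formal structure above can be parsed mechanically. This is a minimal sketch under assumed conventions: the keyword set and the keyword-then-object layout are illustrative, and AE04/AEV07 define the real taxonomy.

```python
# Illustrative goal keywords; the full taxonomy comes from the cited work.
GOAL_KEYWORDS = {"COLLECT", "STORE", "ALLOW", "DISCLOSE", "INFORM", "CUSTOMIZE"}

def to_goal(statement: str):
    """Split 'COLLECT date and times ...' into a (keyword, object) pair."""
    keyword, _, rest = statement.partition(" ")
    if keyword not in GOAL_KEYWORDS:
        raise ValueError(f"unknown goal keyword: {keyword}")
    return keyword, rest

examples = [
    "COLLECT date and times at which site was accessed",
    "STORE credit card information until dispute is resolved",
    "ALLOW affiliates to use information for marketing purposes",
]
for s in examples:
    print(to_goal(s))
```

Once goals are in (keyword, object) form they can be grouped by keyword, traced to requirements, or compared across policies.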
Using the Topic Model [IEEE Int’l Conf. on Requirements Engineering 2013]

§ Select a goal keyword.
§ Select the topic in which the keyword is most likely present.
§ Select documents in which that topic is most likely present.

[Figure: goal keywords (ALLOW, COLLECT, CUSTOMIZE, DISCLOSE, INFORM) map to topics containing those terms (20, 68, 150, 9, 125), which in turn map to documents containing those topics (YouTube Terms of Service, Microsoft Privacy Statement, ConocoPhillips Legal and Privacy Statement).]
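The three selection steps reduce to two probability lookups. The sketch below assumes a topic model (e.g., LDA) has already been fit; the topic numbers and probabilities are made up for illustration and are not the study's data.

```python
# Assumed, pre-computed model outputs (illustrative values only):
# P(topic | keyword): how strongly each topic is associated with a goal keyword.
topic_given_keyword = {
    "COLLECT": {68: 0.7, 20: 0.2, 150: 0.1},
    "ALLOW":   {20: 0.6, 9: 0.3, 125: 0.1},
}

# P(topic | document): the topic mixture of each policy document.
topic_given_doc = {
    "YouTube Terms of Service": {68: 0.5, 20: 0.3},
    "Microsoft Privacy Statement": {68: 0.2, 150: 0.6},
    "ConocoPhillips Legal and Privacy Statement": {9: 0.4, 125: 0.4},
}

def select_documents(keyword: str, k: int = 2):
    """Step 1: take a keyword; step 2: pick its most likely topic;
    step 3: rank documents by that topic's presence and return the top k."""
    dist = topic_given_keyword[keyword]
    topic = max(dist, key=dist.get)
    ranked = sorted(topic_given_doc,
                    key=lambda d: topic_given_doc[d].get(topic, 0.0),
                    reverse=True)
    return topic, ranked[:k]

print(select_documents("COLLECT"))
```

For “COLLECT” this picks topic 68 and returns the documents in which that topic is most prominent, which is exactly how the pipeline surfaces policies likely to contain collection goals.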
Research Contribution Summary

§ Demonstrated that entry-level software engineers are ill-prepared to make LIR decisions.
§ Demonstrated that LIR metrics can provide useful guidance for these decisions.
§ Demonstrated that ambiguities are prevalent in legal texts and have implications for software engineers.
§ Demonstrated that topic modeling can help identify software goals and vulnerabilities in policy documents at scale.
Department of Information Systems
University of Maryland, Baltimore County
http://userpages.umbc.edu/~akmassey/
[email protected]
@akmassey
Co-Presenters: Peter Swire and Justin Hemmings