Footguns and factorisation: how to make users of your cryptographic library successful

Slide 1

Slide 1 text

Footguns and factorisation
How to make users of your cryptographic library successful.
Lindsay Holmwood @auxesis

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

In 2020, over 300,000 patient records — including detailed consult notes — were leaked and used to extort Vastaamo patients.

Slide 4

Slide 4 text

At around 4 pm, Jere checked Snapchat. An email notification popped up on his screen. His hands began to shake. The subject line included his full name, his social security number, and the name of a clinic where he’d gotten mental health treatment as a teenager: Vastaamo.

Source: They Told Their Therapists Everything. Hackers Leaked It All (William Ralston, Wired)

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Vastaamo’s system violated one of the “first principles of cybersecurity”: It didn’t anonymize the records. It didn’t even encrypt them. The only thing protecting patients’ confessions and confidences were a couple of firewalls and a server login screen.

Source: They Told Their Therapists Everything. Hackers Leaked It All (William Ralston, Wired)

Slide 7

Slide 7 text

“I would never do this”

Slide 8

Slide 8 text

20% of submitted code was insecure, but the developers strongly believed it was secure.

Slide 9

Slide 9 text

It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so. Mark Twain

Slide 10

Slide 10 text

Human Factors in Secure Software Development: There is a lot of academic research into cryptography usability and developer behaviour.

Slide 11

Slide 11 text

Let’s take a look at what the research says about:
1. What the usability traps in cryptography libraries are
2. What you can do to help your users avoid them

Slide 12

Slide 12 text

Before we begin…

Slide 13

Slide 13 text

We’re talking about cryptography, not cryptocurrency

Slide 14

Slide 14 text

We’re talking about developers, not end users

Slide 15

Slide 15 text

We’re looking at peer reviewed research, much of it replicated

Slide 16

Slide 16 text

Developers don’t think about security
Published 2014

Slide 17

Slide 17 text

The experiment

Developers were shown 6 code examples that contained vulnerabilities covering:
○ TLS setup
○ Time-Of-Check-To-Time-Of-Use
○ Brute force exhaustion
○ Buffer overflows
○ XSS
○ SQL injection

Slide 18

Slide 18 text

The experiment

Developers were grouped and prompted before each example:

Question | Group
What is the user input to this program? | Control, Priming
What happens when this code executes? | Control, Priming
Could a developer experience unexpected results when running such code? | Priming
What could be examples of these unexpected results and where do they appear in the code? | Priming
This code has a vulnerability. Can you pinpoint the problem? | Explicit

Slide 19

Slide 19 text

The result? Developers who were primed identified security problems at almost twice the rate of un-primed developers.

Slide 20

Slide 20 text

What it means

“security is not part of the heuristics used by developers in their daily programming tasks”

Developers mostly focus on functionality and performance.

Slide 21

Slide 21 text

What it means

We think that developers should learn about security, and apply what they learn when they need it.

This is not how humans actually think.

Slide 22

Slide 22 text

What it means

“we recommend software that interface with the developer (IDEs, text editors, compilers, etc) prime developers on the spot when they need it: while coding”

Oliveira et al. (2014)

Slide 23

Slide 23 text

TL;DR: devs don’t think about security; priming increases the likelihood of discovering security bugs

Slide 24

Slide 24 text

Stack Overflow provides functional yet insecure code examples
Published 2016

Slide 25

Slide 25 text

The experiment

Android devs provided with:
○ a skeleton app
○ 4 tasks in random order:
1. Secure Networking
2. Secure Storage
3. Inter-Component Communication
4. Least Permissions
○ 20-30 minutes to complete each task
○ An exit survey

Split into 4 groups, could only refer to:
○ Stack Overflow
○ Official documentation
○ Books
○ All of the above

Slide 26

Slide 26 text

The results?
1. Functional results
2. Secure results

Slide 27

Slide 27 text

Functional results

Devs assigned Stack Overflow and books performed best. Devs assigned official docs performed worst.

Resources | Success Rate
Stack Overflow | 67.3%
Book | 66.1%
Official docs | 40.4%
Free choice | 51.8%

Slide 28

Slide 28 text

Secure results

Developers self-assessed their solutions to be secure when they mostly were not.

Task | Success Rate | Confidence
Least Permissions | 87% | 22%
Secure Networking | 38% | 79%
Inter Component Comms | 38% | 70%
Secure Storage | 38% | 68%

Slide 29

Slide 29 text

What this means

Android developers who use Stack Overflow get functional code quicker, but are more likely to produce insecure code.

Slide 30

Slide 30 text

What this means

Devs are unlikely to stop using Stack Overflow, because they get functional code.
⬇
Developers mostly focus on functionality and performance.

Slide 31

Slide 31 text

What this means

“it is critical to develop documentation and resources that combine the usefulness of forums like Stack Overflow with the security awareness of books or official API documents”

Acar et al. (2016)

Slide 32

Slide 32 text

What you can do

1. Regularly spend time looking at Stack Overflow questions about your library
2. Identify gaps in documentation being filled by Stack Overflow
3. Fill the gaps
4. Post links in follow-up comments on accepted Stack Overflow answers

Slide 33

Slide 33 text

TL;DR, Stack Overflow answers provide functional, insecure code; developers are bad at assessing whether that code is secure

Slide 34

Slide 34 text

✨ side quest ✨
Published 2021

Slide 35

Slide 35 text

“Our results show that insecure code ends up in the top results and is clicked on more often.”

“There is at least a 22.8% chance that one out of the top three Google Search results [for Stack Overflow] leads to insecure code”

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

“Participants that used our modified search engine to look for help online submitted more secure and functional results, with statistical significance”

Slide 38

Slide 38 text

TL;DR, Novel solution to the problem that requires zero behaviour change from developers; can someone in the audience action this?

Slide 39

Slide 39 text

Most “cryptography bugs” come from the use of libraries, not the libraries themselves
Published 2014

Slide 40

Slide 40 text

The study

Analysis of 269 “Cryptographic Issues” (CWE-310) in the CVE database from Jan 2011 to May 2014.

Categorised vulnerabilities by:

Impact
○ Plaintext disclosure
○ Manipulator-In-The-Middle
○ Brute-force
○ Side-channel

Layer
○ Primitive (like AES, DH)
○ Protocol (library or abstraction)
○ Application (everything else)

Slide 41

Slide 41 text

The result?

○ 100% of plaintext disclosures were in the app layer
○ 86% of MITM were in the app layer
○ 79% of brute-force were in the app layer
○ 0% of side-channel were in the app layer

Slide 42

Slide 42 text

What you can do

○ Expose interfaces in your libraries that are misuse resistant.
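
One way to read “misuse resistant” is an interface that leaves no security decisions to the caller. The sketch below is illustrative only and uses nothing but Python's standard library; the names `SecretKey`, `sign`, and `verify` are hypothetical and do not come from any library discussed in the talk. The key size is fixed, the algorithm is not a parameter, and verification always uses a constant-time comparison.

```python
import hashlib
import hmac
import secrets


class SecretKey:
    """Opaque key wrapper (hypothetical): fixed size, bytes kept out of logs."""

    _SIZE = 32  # bytes; fixed so callers cannot choose a weak length

    def __init__(self, raw: bytes):
        if len(raw) != self._SIZE:
            raise ValueError("key must be exactly 32 bytes; use SecretKey.generate()")
        self._raw = raw

    @classmethod
    def generate(cls) -> "SecretKey":
        # The recommended path: key bytes come from the OS CSPRNG.
        return cls(secrets.token_bytes(cls._SIZE))

    def __repr__(self) -> str:
        # Never print key material, even by accident.
        return "SecretKey(<redacted>)"


def sign(key: SecretKey, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag; the algorithm is deliberately not configurable."""
    return hmac.new(key._raw, message, hashlib.sha256).digest()


def verify(key: SecretKey, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time, so callers never compare tags with ==."""
    return hmac.compare_digest(sign(key, message), tag)
```

A caller only ever writes `key = SecretKey.generate()`, `tag = sign(key, msg)`, and `verify(key, msg, tag)`; there is no mode, digest, or comparison to get wrong.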

Slide 43

Slide 43 text

What this means

83% of cryptographic bugs come from improper uses of cryptographic libraries.

Slide 44

Slide 44 text

Features and documentation are the #1 requirement for devs to use your library securely
Published 2017

Slide 45

Slide 45 text

The experiment

○ Python devs assigned five tasks with stub code:
1. Symmetric: key generation
2. Symmetric: encryption/decryption
3. Asymmetric: key generation
4. Asymmetric: encryption/decryption
5. Asymmetric: certificate validation
○ An exit interview: “is your code secure?”

Assigned one of five Python crypto libs:
1. cryptography.io🤞
2. Keyczar🤞
3. PyNaCl🤞
4. M2Crypto
5. PyCrypto

Asked to only use included docs.

Slide 46

Slide 46 text

The experiment

Modified Jupyter Notebook to:
○ Snapshot the code
○ Detect and store copy/paste events

Slide 47

Slide 47 text

The results?
1. Functional results
2. Secure results

Slide 48

Slide 48 text

Functional results

Wide variation of task success across libraries. Asymmetric tasks less successful.

Slide 49

Slide 49 text

Functional results

Copy-paste code 3× more likely to be functional on symmetric tasks.

Slide 50

Slide 50 text

Functional results

Python experience, security background, or library experience did not change success.

Slide 51

Slide 51 text

Secure results

Asymmetric tasks 3× more likely to produce a secure solution.

Slide 52

Slide 52 text

Secure results

Keyczar usage 25× more likely to be secure. But only 10% of submitted Keyczar solutions were functional.

Slide 53

Slide 53 text

Secure results

20% of submitted code was insecure, but the developers strongly believed it was secure.

Slide 54

Slide 54 text

What it means

Libraries with simplified interfaces produced more secure results. And yet, security success rate <80%, even on simplified libraries.

Slide 55

Slide 55 text

What it means

Peripheral features (like secure key storage & password-based key generation) make or break security.

You can’t rely on the dev to identify danger, or to find secure alternatives.
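
As an illustration of shipping a peripheral feature instead of leaving it to the user, here is a minimal sketch of a password-based key derivation helper built on the standard library's `hashlib.pbkdf2_hmac`. The function name `derive_key`, the iteration floor, and the salt size are assumptions made for this example, not values taken from the study.

```python
import hashlib
import secrets
from typing import Optional, Tuple

# Assumed floor for PBKDF2-HMAC-SHA256; tune it for your own threat model.
MIN_ITERATIONS = 600_000
SALT_SIZE = 16  # bytes


def derive_key(password: str, salt: Optional[bytes] = None,
               iterations: int = MIN_ITERATIONS) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from a password. Returns (key, salt)."""
    if not password:
        raise ValueError("password must not be empty")
    if iterations < MIN_ITERATIONS:
        raise ValueError(f"iterations must be at least {MIN_ITERATIONS}")
    if salt is None:
        # A fresh random salt per key; the caller stores it next to the ciphertext.
        salt = secrets.token_bytes(SALT_SIZE)
    elif len(salt) < SALT_SIZE:
        raise ValueError(f"salt must be at least {SALT_SIZE} bytes")
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return key, salt
```

Calling `derive_key("passphrase")` creates a new key and salt; passing the stored salt back in reproduces the same key. The helper refuses weak parameters outright, so the dev does not have to know that they are weak.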

Slide 56

Slide 56 text

What it means

You can never have too many code examples.

If devs can’t find working code examples in your docs, they will go to Stack Overflow and find a footgun.

Slide 57

Slide 57 text

What you can do

1. Create and maintain an overabundance of secure code examples for your library
2. Identify inputs and dependencies your users implicitly rely on, and either:
○ Provide secure wrappers for them
○ Document them exhaustively

Slide 58

Slide 58 text

TL;DR, Misuse resistance helps but it ain’t enough; devs crave code examples — you need to provide them

Slide 59

Slide 59 text

Put runtime warnings into your API for potentially insecure usage
Published 2018

Slide 60

Slide 60 text

The experiment

Python devs assigned:
○ Three tasks with stub code:
1. Symmetric: key generation
2. Symmetric: encryption
3. Key storage
○ Control or patched version of PyCrypto
○ An exit interview: “is your code secure?”

Patched PyCrypto? It prints to stdout if a created object uses an insecure cryptography feature.

Slide 61

Slide 61 text

The results

Condition | Functional success | Secure success
Control | 90% | 27%
Patch | 86% | 51%

Devs shown warnings produced secure solutions at nearly twice the rate.
Devs self-assessed their solutions as more secure when given the warning.
Devs rated control + patched PyCrypto equally usable.

Slide 62

Slide 62 text

The results

Devs who wrote code that triggered a warning were 15× more likely to convert it to a secure solution.

Slide 63

Slide 63 text

Warnings need these 5 things

1. Title message
2. Colour
3. Source code location
4. Link to external resources
5. Message classification
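
To make the five elements concrete, here is a small sketch of a message type that carries all of them. The class name `SecurityAdvice`, its field names, and the example.com URL are hypothetical, chosen only for illustration; the colour is a plain ANSI escape code.

```python
from dataclasses import dataclass

YELLOW = "\033[33m"  # ANSI escape for yellow text
RESET = "\033[0m"


@dataclass
class SecurityAdvice:
    title: str           # 1. title message
    classification: str  # 5. message classification
    filename: str        # 3. source code location (file)
    lineno: int          # 3. source code location (line)
    url: str             # 4. link to external resources

    def render(self, colour: bool = True) -> str:
        # 2. colour: highlight the headline only, and allow plain output for logs
        start, end = (YELLOW, RESET) if colour else ("", "")
        return (
            f"{start}[{self.classification}] {self.title}{end}\n"
            f"  at {self.filename}:{self.lineno}\n"
            f"  more: {self.url}"
        )


advice = SecurityAdvice(
    title="ECB mode leaks patterns in the plaintext",
    classification="insecure-mode",
    filename="app.py",
    lineno=42,
    url="https://example.com/docs/modes",  # placeholder URL
)
print(advice.render(colour=False))
```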

Slide 64

Slide 64 text

What it means

You don’t have to change your API’s shape to increase secure usage.

Emitting runtime warnings significantly improves secure usage.

Slide 65

Slide 65 text

“we recommend software that interface with the developer (IDEs, text editors, compilers, etc) prime developers on the spot when they need it: while coding” ⬇ ⬇ ⬇

Slide 66

Slide 66 text

What you can do

1. Make your library emit warnings when you detect potentially insecure usage
2. Provide an obvious mechanism to silence warnings
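
Both points can be sketched with Python's standard `warnings` module. This is an assumed design, not the patch used in the study: `InsecureUsageWarning`, `new_cipher`, and `silence_insecure_usage_warnings` are hypothetical names, and `new_cipher` returns a plain dict rather than a real cipher object.

```python
import warnings


class InsecureUsageWarning(UserWarning):
    """Dedicated category (hypothetical) so users can filter these warnings by class."""


INSECURE_MODES = {"ECB"}


def new_cipher(mode: str) -> dict:
    """Toy stand-in for a library constructor; returns a config dict, not a cipher."""
    if mode.upper() in INSECURE_MODES:
        warnings.warn(
            f"{mode.upper()} mode leaks patterns in the plaintext; "
            "prefer an authenticated mode such as GCM",
            InsecureUsageWarning,
            stacklevel=2,  # report the caller's file and line, not the library's
        )
    return {"mode": mode.upper()}


def silence_insecure_usage_warnings() -> None:
    """The obvious off switch, for callers who have reviewed the risk."""
    warnings.filterwarnings("ignore", category=InsecureUsageWarning)
```

Because the warning has its own category, users can also silence or escalate it with the usual tools, for example `python -W error` in CI to turn warnings into failures.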

Slide 67

Slide 67 text

TL;DR, Use runtime warnings to prime developers about insecure usage during dev

Slide 68

Slide 68 text

Takeaways

Slide 69

Slide 69 text

Developers don’t prioritise security (you have to do the hard work for them)

Slide 70

Slide 70 text

You can help developers prioritise by providing:
○ Targeted warnings in your APIs
○ Frictionless, misuse-resistant APIs that work and are secure
○ Copious secure code examples in your documentation

Slide 71

Slide 71 text

Most security bugs happen on the periphery of your cryptography library, inside your users’ applications

Slide 72

Slide 72 text

Your challenge: do the hard work to make it easy for your users

Slide 73

Slide 73 text

Thank you! 🤔 What questions do you have?

Like the talk? Let @auxesis know on Twitter.
Slides at cipherstash.com/lindsay
Come help us build next gen cryptography at CipherStash!

Slide 74

Slide 74 text

References

Oliveira, D., Rosenthal, M., Morin, N., Yeh, K. C., Cappos, J., & Zhuang, Y. (2014, December). It's the psychology stupid: How heuristics explain software vulnerabilities and how priming can illuminate developer's blind spots. In Proceedings of the 30th Annual Computer Security Applications Conference (pp. 296-305).

Acar, Y., Backes, M., Fahl, S., Kim, D., Mazurek, M. L., & Stransky, C. (2016, May). You get where you're looking for: The impact of information sources on code security. In 2016 IEEE Symposium on Security and Privacy (SP) (pp. 289-305). IEEE.

Fischer, F., Stachelscheid, Y., & Grossklags, J. (2021, November). The effect of Google Search on software security: Unobtrusive security interventions via content re-ranking. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (pp. 3070-3084).

Lazar, D., Chen, H., Wang, X., & Zeldovich, N. (2014, June). Why does cryptographic software fail? A case study and open problems. In Proceedings of 5th Asia-Pacific Workshop on Systems (pp. 1-7).

Acar, Y., Backes, M., Fahl, S., Garfinkel, S., Kim, D., Mazurek, M. L., & Stransky, C. (2017, May). Comparing the usability of cryptographic APIs. In 2017 IEEE Symposium on Security and Privacy (SP) (pp. 154-171). IEEE.

Gorski, P. L., Iacono, L. L., Wermke, D., Stransky, C., Möller, S., Acar, Y., & Fahl, S. (2018). Developers deserve security warnings, too: On the effect of integrated security advice on cryptographic API misuse. In Fourteenth Symposium on Usable Privacy and Security (SOUPS 2018) (pp. 265-281).

Gorski, P. L., Acar, Y., Lo Iacono, L., & Fahl, S. (2020, April). Listen to developers! A participatory design study on security warnings for cryptographic APIs. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-13).

Slide 75

Slide 75 text

Further reading

○ API Blindspots: Why Experienced Developers Write Vulnerable Code
○ Blindspots in Python and Java APIs Result in Vulnerable Code
○ What Do We Really Know about How Habituation to Warnings Occurs Over Time? A Longitudinal fMRI Study of Habituation and Polymorphic Warnings
○ I Do and I Understand. Not Yet True for Security APIs. So Sad
○ How Usable are Rust Cryptography APIs?
○ You Really Shouldn’t Roll Your Own Crypto: An Empirical Study of Vulnerabilities in Cryptographic Libraries

Slide 76

Slide 76 text

Appendix

Slide 77

Slide 77 text

Appendix

It’s the Psychology Stupid: How Heuristics Explain Software Vulnerabilities and How Priming Can Illuminate Developer’s Blind Spots

Developers who were primed identified security problems at almost twice the rate of un-primed developers.

Slide 78

Slide 78 text

Appendix

It’s the Psychology Stupid: How Heuristics Explain Software Vulnerabilities and How Priming Can Illuminate Developer’s Blind Spots

Some types of vulnerabilities were more familiar to developers than others.

Slide 79

Slide 79 text

Stack Overflow provides functional, insecure code
Published 2016

Slide 80

Slide 80 text

A study in two parts

1. Survey developers about what information sources they use
2. Study developers when given security challenges with different information sources

Slide 81

Slide 81 text

1. The Survey

○ most devs use search engines and Stack Overflow
○ a lot also consult the official API documentation
○ a few use books

Slide 82

Slide 82 text

2. The Study

Android devs assigned to one of four study groups:
1. Stack Overflow only
2. Official documentation only
3. Books only
4. Free choice

Slide 83

Slide 83 text

2. The Study

Android devs provided with:
○ a skeleton app
○ 4 tasks in random order
○ 20-30 minutes to complete each task
○ An exit interview

Slide 84

Slide 84 text

2. The Study

The four tasks:
1. Secure networking
2. Secure storage
3. Inter-Component Communication
4. Least permissions

Slide 85

Slide 85 text

Results?
1. Function
2. Security

Slide 86

Slide 86 text

Results: function

“Our results demonstrate that the assigned resource condition had a notable impact on participants’ ability to complete the tasks functionally correctly”

Slide 87

Slide 87 text

Results: function

○ Stack Overflow and book participants performed best
○ Official doc participants performed worst

Condition | Success Rate
SO | 67.3%
Book | 66.1%
Official Docs | 40.4%
Free | 51.8%

Slide 88

Slide 88 text

Results: function

Self-assessment binning:
○ Confident == strongly agree/agree
○ Not Confident == strongly disagree/disagree/neutral

Task | Confidence
Least Permissions | 81.1%
Secure Networking | 20.1%
ICC | 40.7%
Secure Storage | 53.7%

Slide 89

Slide 89 text

Results: security

“Our results suggest that choice of resources has the opposite effect on security than it did on functionality”

Slide 90

Slide 90 text

Results: security

○ Official Docs and Book participants performed best
○ Stack Overflow participants performed worst

Condition | Success Rate
SO | 51.4%
Book | 73.0%
Official Docs | 85.7%
Free | 65.5%

Slide 91

Slide 91 text

Results: security

Self-assessment binning:
○ Confident == strongly agree/agree
○ Not Confident == strongly disagree/disagree/neutral

Task | Confidence
Least Permissions | 22.2%
Secure Networking | 79.6%
ICC | 70.4%
Secure Storage | 68.5%

Slide 92

Slide 92 text

What this means

○ “using Stack Overflow helps Android developers to arrive at functional solutions more quickly than with other resources”
○ “Because Stack Overflow contains many insecure answers, Android developers who rely on this resource are likely to create less secure code.”
○ “developers are unlikely to give up using resources that help them quickly address their immediate problems”

Slide 93

Slide 93 text

What this means

○ “it is critical to develop documentation and resources that combine the usefulness of forums like Stack Overflow with the security awareness of books or official API documents”
○ “Stack Overflow could add a mechanism for explicitly rating the security of provided answers and weighting those rated secure more heavily in search results and thread ordering”

Slide 94

Slide 94 text

Most “cryptography bugs” come from the use of libraries, not the libraries themselves
Published 2014

Slide 95

Slide 95 text

Slide 96

Slide 96 text

The result?

○ 100% of plaintext disclosures were in the app layer
○ 86% of MITM were in the app layer
○ 79% of brute-force were in the app layer
○ 0% of side-channel were in the app layer