Not screwing up Encryption as a Developer

I'm Simon. I'm a developer. I like building stuff and
learning stuff. Find me on Twitter: @sstur_

Encryption
Simply put, encryption is the process of encoding information in such a way that only authorized parties can access it. Encryption is part of the broader topic of cryptography. This talk is about how cryptography relates to you as a developer.

Why would a developer need to understand this?
Software is about processing information. People and businesses trust your software with their information. In many ways, it falls on you as a software developer to understand how to keep information safe.

Encryption is one important way to keep information safe from
attackers and unauthorized parties.

Uses of Cryptography
• Communicating without letting unauthorized parties read the message.
• Verifying the integrity of a message or file: being sure it has not been modified.
• Storing data in a way that only you can later read it.
• Validating that the party you are communicating with is indeed who they claim to be.
• Saving passwords in a way that cannot be reversed to see the original password.
• Generating unique IDs, such as in Git (...)

Security Fail
There are many ways we, as developers, fail at security, such as:
• forgetting to password protect a database
• exposing private data through a web interface or API
• writing software with vulnerabilities that allow an attacker to run code on your server
• publishing sensitive information to a public repository

Crypto Fail
One category of ways developers fail at security is around cryptography.
• Sending information unencrypted, through insecure channels
• Using weak encryption
• Not verifying authenticity of a remote party
• Storing sensitive info like passwords in insecure ways

Let's talk
In order to understand how to do this correctly as a developer, we should first talk about:
1. What problems we are trying to solve
2. The different kinds of encryption
3. How they work together to solve such problems

Cryptography can help us
• Securely communicate with someone over an insecure channel
• Securely store data so that only we (or someone we trust) can retrieve it
• Trust: Verify that an unknown party is who they claim to be
• Data integrity: Verify that some data sent from that party has not been modified by someone
• Key derivation: Create an encryption key from a text such as a password

Types of cryptography
There are 3 important types of cryptography we’ll explore today:
• Hashing
• Symmetric Encryption
• Asymmetric Encryption, including:
  ◦ key exchange
  ◦ public key cryptography

Hashing

Hashing
A hash is a sort of one-way message digest. It takes some data of arbitrary length as input (e.g. a password, or an entire file) and generates a fixed-length output (the digest or hash) that “represents” the input in a way that follows these two principles:
• Deterministic: The same input will always generate the same output.
• One-way: The input cannot be determined from the output. Information is lost in the process, making it impossible to reconstruct the original.
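To make the two principles concrete, here is a minimal sketch using SHA-256 from Python's standard library. The helper name `sha256_hex` and the sample inputs are just for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"hunter2")
long_digest = sha256_hex(b"x" * 1_000_000)

# Deterministic: hashing the same input again gives the same digest.
assert short_digest == sha256_hex(b"hunter2")

# Fixed length: 256 bits is 64 hex characters, whatever the input size.
assert len(short_digest) == len(long_digest) == 64

# Changing a single character gives an unrelated-looking digest.
print(short_digest)
print(sha256_hex(b"hunter3"))
```

Nothing in the digest lets you walk backwards to the input; the only generic approach is to guess inputs and hash them.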

Collisions
Since the output (hash) is a fixed size which may be fewer bytes than the input, mathematically it cannot represent as much data, meaning that there are fewer possible hashes than there are different inputs. Thus, it’s possible that two different inputs result in the same hash value. This is called a collision.
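Collisions are easy to see if we deliberately shrink the hash. The sketch below is a toy, not an attack on SHA-256: `tiny_hash` is a made-up helper that keeps only the first 2 bytes of a SHA-256 digest, so there are just 65,536 possible outputs and a collision must turn up within 65,537 distinct inputs.

```python
import hashlib


def tiny_hash(data: bytes) -> bytes:
    """Toy hash: the first 2 bytes (16 bits) of SHA-256. Illustration only."""
    return hashlib.sha256(data).digest()[:2]


def find_collision() -> tuple[bytes, bytes]:
    """Hash "0", "1", "2", ... until two different inputs share a tiny_hash."""
    seen: dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = str(counter).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
        counter += 1


first, second = find_collision()
assert first != second
assert tiny_hash(first) == tiny_hash(second)
print(first, second, tiny_hash(first).hex())
```

With the full 256-bit output the same search is hopeless in practice, which is the point of the next slide.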

Avoiding Collisions
A good hash algorithm minimizes the probability of collisions occurring in the wild. Take the SHA-256 hashing algorithm for example (the one used in Bitcoin mining). No two inputs have ever been discovered that produce the same output. The probability of finding one in our lifetime is incredibly small.

Uses of Hashing
Hashing is often used as a “checksum” which can verify the integrity of some data, for example to make sure a network error didn’t cause data corruption during transmission.
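As a rough sketch of the checksum use case, the helper below hashes a file in chunks so that large downloads do not need to fit in memory. The function name `file_sha256` and the file name are hypothetical.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "download.bin")
    with open(path, "wb") as handle:
        handle.write(b"important payload")
    published = file_sha256(path)

    # Simulate corruption: one extra byte is enough to change the checksum.
    with open(path, "ab") as handle:
        handle.write(b"!")
    assert file_sha256(path) != published
```

A plain checksum like this catches accidental corruption. It does not stop someone who can replace both the file and the published checksum.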

Uses of Hashing
Another use case is to sign a message using a secret that only you know. If you later receive that message back (e.g. session cookie) then you know that it was not modified. One way to do this is using HMAC: Hash-based Message Authentication Code.

HMAC
If you and I both know the secret, then I can send you a signed message and you can verify that it was indeed from me, and not modified by someone intercepting the message. But why is it hashed twice?
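Here is a minimal sketch of signing and verifying with HMAC using Python's standard `hmac` module. The secret value and the cookie contents are placeholders; in a real system the secret would come from secure configuration, not from source code.

```python
import hashlib
import hmac

# Hypothetical shared secret, for illustration only.
SECRET = b"shared-secret-known-to-both-parties"


def sign(message: bytes, secret: bytes = SECRET) -> str:
    """Return an HMAC-SHA256 tag for the message as a hex string."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str, secret: bytes = SECRET) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message, secret), tag)


cookie = b"user_id=42"
tag = sign(cookie)

assert verify(cookie, tag)             # untouched message passes
assert not verify(b"user_id=1", tag)   # modified message fails
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a forged tag were correct.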

Length Extension Attack
Some hashes have a weakness that allows a length extension attack. When a Merkle–Damgård based hash is misused as a message authentication code with the construction H(secret + message), and the message and the length of the secret are known, a length extension attack allows anyone to append extra information to the end of the message and produce a valid hash without knowing the secret. Hashing twice, as HMAC does, closes this hole.

But regardless of good hashing practice, this does not hide
the contents of the message from anyone. For that we need to use encryption.

Symmetric Encryption

Symmetric Encryption
Going far back in history, people have needed to encode a message so that only the intended recipient can understand it. In the very old days, the "method of encryption" was the secret. In modern encryption, the method of encryption (cipher) is public but the key is secret.

Substitution Cipher
The simplest form of using a key to encrypt a message is a "substitution cipher" in which each letter is replaced by a different letter. In such a system, the "key" is the list of substitutions, so the recipient can swap back in the original letters.
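A substitution cipher fits in a few lines. In this sketch the key is a shuffled alphabet; `make_key` and `substitute` are made-up helper names, and `random.Random` with a fixed seed is used only so the demo is repeatable (it is not a secure way to generate keys).

```python
import random
import string


def make_key(seed: int) -> dict[str, str]:
    """Build a letter-to-letter mapping by shuffling the alphabet."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)  # demo only, not cryptographically secure
    return dict(zip(letters, shuffled))


def substitute(text: str, mapping: dict[str, str]) -> str:
    """Replace each letter using the mapping; leave other characters alone."""
    return "".join(mapping.get(ch, ch) for ch in text)


key = make_key(seed=2024)
inverse_key = {cipher: plain for plain, cipher in key.items()}

ciphertext = substitute("attack at dawn", key)
print(ciphertext)

# The recipient, who holds the same key, swaps the original letters back in.
assert substitute(ciphertext, inverse_key) == "attack at dawn"
```

Notice that every "a" in the plaintext becomes the same letter in the ciphertext, so the shape of the message survives encryption.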

Substitution Cipher

Substitution Cipher
However, this is easily broken, even without a computer. Anyone know how?

A Stronger Substitution Cipher
One way to make that stronger is to change the character mapping after each character is processed. The mapping changes according to some predetermined method.

Enigma
This is how the Enigma machine worked in WW2. The way in which the mapping changed with each keypress was effectively the key.

Distributing the Key
These keys were distributed manually, on paper, and the machine needed to be reconfigured with each new key.

This brings us to the next important part of encryption,
key exchange.

Key Exchange
In symmetric ciphers, like the Enigma, or even modern-day ones such as AES, both parties need to have a secret key. Getting that key from one person to another was historically done physically. This can present a huge challenge. More on this later.

Blocks of Data
Many modern symmetric algorithms, including AES, encrypt data in blocks. Data is divided up into equal size blocks (the block size of the algorithm), padded if necessary, and each block is encrypted individually.

Block Mode: ECB
If you simply use the key to encrypt each block independently, that approach is called “ECB”, which is one of the block modes of the algorithm. The problem is that encryption is deterministic for a given key, meaning that two identical input blocks will result in identical output blocks. Over many blocks, a pattern can emerge.

The solution is to use a block mode other than
the naïve “ECB” mode.

Block Mode: CBC
CBC mode, for example, uses a randomly generated IV (initialization vector) as a sort of salt: it is combined (XORed) with the first block before encryption. Each subsequent block is then combined with the encrypted form of the previous block before it is encrypted.

Block Mode: CBC
Thus, two messages with the same content will result in different output, assuming the IV is different each time. The IV itself is not secret, and in fact must be provided along with the encrypted data, since it is needed for decryption.
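To see the difference between ECB and CBC without installing anything, the sketch below builds a toy 16-byte block cipher (a small Feistel network with SHA-256 as the round function) and runs it in both modes. The toy cipher is an assumption made for illustration and is NOT secure; in real code you would use AES from a vetted library. Padding is skipped, so the message length must be a multiple of 16.

```python
import hashlib
import secrets

BLOCK = 16  # bytes per block, the same block size AES uses


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (illustration only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def split(data: bytes) -> list[bytes]:
    assert len(data) % BLOCK == 0, "this sketch does not implement padding"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # ECB: every block is encrypted on its own.
    return b"".join(encrypt_block(key, b) for b in split(plaintext))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # CBC: XOR each block with the previous ciphertext block (or the IV) first.
    out, prev = [], iv
    for block in split(plaintext):
        prev = encrypt_block(key, xor(block, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for block in split(ciphertext):
        out.append(xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


key = secrets.token_bytes(32)
message = b"SIXTEEN BYTE BLK" * 3  # three identical plaintext blocks

ecb_blocks = split(ecb_encrypt(key, message))
assert ecb_blocks[0] == ecb_blocks[1] == ecb_blocks[2]  # the pattern leaks

iv = secrets.token_bytes(BLOCK)
cbc_blocks = split(cbc_encrypt(key, iv, message))
assert len(set(cbc_blocks)) == 3  # identical inputs, different outputs
assert cbc_decrypt(key, iv, b"".join(cbc_blocks)) == message
```

Encrypting the same message again with a fresh IV gives a completely different ciphertext, which is why the IV has to travel with the data.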

Asymmetric Encryption

Asymmetric Encryption
Prior to the 1970s the only way to use encryption to communicate with someone was if you both had a shared secret: some encryption key that you both know but no one else knows. Securely getting a shared secret to someone presented many challenges because it could not be done electronically.

But then, in the 1970s, we got Public Key Cryptography
in two important forms.

Whitfield Diffie and Martin Hellman published a concept in 1976
for negotiating a secret key over an insecure channel. This is a fascinating method known as Diffie Hellman key exchange and is used to this day.

Ron Rivest, Adi Shamir, and Leonard Adleman at MIT came
up with what is now known as RSA, a method of asymmetric encryption involving a public key and a private key. This is used to this day for many things including TLS (HTTPS).

Diffie Hellman
Let's start with Diffie Hellman, a method for secure key exchange. The idea is that two parties can communicate in public, and yet still end up with a shared secret that no one else can guess, even if someone is listening in to the whole exchange.
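The exchange can be sketched with nothing more than modular exponentiation. The parameters below are toy values picked for this demo (a 127-bit Mersenne prime and base 3); real deployments use standardized groups of 2048 bits or more, or elliptic curves, and the names `alice` and `bob` are just the usual placeholders.

```python
import secrets

# Public parameters, agreed on in the open. Toy-sized for illustration only.
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair() -> tuple[int, int]:
    """Pick a random private exponent and compute the matching public value."""
    private = secrets.randbelow(P - 3) + 2  # somewhere in [2, P - 2]
    public = pow(G, private, P)
    return private, public


alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

# Only the two public values cross the wire. An eavesdropper sees both.
alice_shared = pow(bob_public, alice_private, P)
bob_shared = pow(alice_public, bob_private, P)

# Both sides arrive at the same secret without ever sending it.
assert alice_shared == bob_shared
```

Each side combines its own private value with the other side's public value. Recovering a private value from a public one means solving a discrete logarithm, which is believed to be infeasible at realistic sizes.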

Diffie Hellman
To extend the paint analogy a little more, it’s based on the principle that it’s easy to mix two colors together, but difficult to determine which two colors went into the mixture. This is a form of "trapdoor" function (more precisely, a one-way function): an operation that is easy to perform but hard to reverse, although a proposed answer is easy to verify. A combination lock works the same way.

Most asymmetric encryption relies on a hard mathematical problem as the trapdoor function. RSA uses prime number factorization: it’s easy to multiply two large primes together, but hard to determine which two primes were multiplied, given only the product. Diffie Hellman relies on a related hard problem, the discrete logarithm.

Use case: Diffie Hellman
For example, the Diffie Hellman algorithm is used by SSH. At the beginning of the connection, the two computers establish a shared secret using DH and then use that to derive the key used in symmetric encryption for the rest of the session.

Public Key Encryption
RSA is one of the oldest and most widely used forms of public key encryption, a type of asymmetric encryption that uses a public + private key pair. Unlike symmetric encryption, there are two keys. Anything encrypted by the public key can only be decrypted by the private key, and vice versa.

Use Cases
This allows you to send a message that only the receiver can decrypt. There are several interesting uses; we'll focus on two:
• Establishing a shared secret
• Signing a message for the public

1. Key Exchange
Establishing a shared secret between two parties, over a public communication channel, just like Diffie Hellman. For example, if I know your public key, I can generate a random secret, encrypt it with your public key and send it to you. You can then decrypt it and we have a shared secret. The advantage is that it doesn’t require the sort of chatty, back-and-forth communication that Diffie Hellman requires.

The disadvantage of this is that it does not provide
forward secrecy, something we’ll talk about more soon.

Signing a message with RSA
Imagine I want to send you a file, and I want to include a way for you to know that it hasn’t been modified. I can hash the file and provide that hash. But that alone doesn’t really prevent intentional modification by a third party, because the adversary can generate a hash too.

2. Signing a message
I would actually encrypt that hash value using my private key. Assuming you know my public key, you can decrypt my hash, compute your own hash of the file and verify the two hashes match. This effectively proves that the file came from me (or someone with my private key) and has not been modified.
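The hash-then-encrypt idea can be sketched with "textbook" RSA and deliberately tiny numbers. Everything here is an assumption made for the demo: the primes 61 and 53, the exponent 17, and reducing the hash modulo N so it fits. Real signatures use keys of 2048 bits or more and a padding scheme from a vetted library.

```python
import hashlib

# Toy key pair, for illustration only.
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and squeeze it below N (only needed because N is tiny)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(data)


document = b"contents of the file I am sending you"
signature = sign(document)

assert verify(document, signature)                 # genuine signature passes
assert not verify(document, (signature + 1) % N)   # altered signature fails
```

Only the holder of `D` can produce a signature that verifies, while anyone with `N` and `E` can check it.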

The hash encrypted with my private key is effectively the
signature. But you need to know my public key for this to work.

Getting the Public Key
You could get it separately, from a trusted third party, but then we have a new problem of securely distributing everyone’s public key. This isn’t really feasible at web scale.

The way that it's actually done is that I send
you the public key along with the file and the signature.

Anyone could send you a file with a signature and
a public key, but only I can send you a document with a signature that was generated from my key pair.

So how do you know it’s actually my public key?

You need some way to trust that the sender is who
they say they are. This is essentially a new problem, one of identity and trust.

Certificate Pinning
If the two computers are both controlled by you, such as servers in different data centers, and you want to be sure no one intercepts the communication, you can use “certificate pinning”, which just means pre-loading the “certificate” of the other machine on each machine.

Of course, that doesn’t work for the public, who don’t
have a database of all the valid certificates of every server on the internet.

3. Chain of trust
The most common approach to this is to use PKI (public key infrastructure) to establish a chain of trust. This is an important part of SSL (more accurately TLS) used in HTTPS.

Public Key Infrastructure
We establish a chain of trust back to a well-known trusted authority, a CA (certificate authority). In the case of the web, there is a set of trusted root CAs. These are entities that are globally recognized and widely trusted, such as letsencrypt.org. The certificate (including public key) of each root CA is pre-installed in your browser or operating system.

Example of this process
For this example, let’s say you are communicating with a server on the internet, a server which claims to be that of mybank.com. The server will send you their public key along with some information (such as company name), collectively their certificate, which will be “signed” by a root certificate authority.

If the CA that signed the server’s certificate is in
your list of root CAs on your computer, then you have the public key on file and you can validate that signature, effectively validating that the server is who it claims to be.
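In application code you rarely walk the chain yourself; the TLS library does it against the root CAs installed on the machine. A minimal sketch with Python's standard `ssl` module follows. The helper `fetch_peer_certificate` is a hypothetical name and is only defined here, not called, since it needs network access.

```python
import socket
import ssl

# The default context loads the system's trusted root CAs and turns on
# both certificate validation and hostname checking.
context = ssl.create_default_context()

assert context.verify_mode == ssl.CERT_REQUIRED
assert context.check_hostname is True


def fetch_peer_certificate(hostname: str, port: int = 443) -> dict:
    """Connect over TLS and return the server's validated certificate fields.

    The handshake raises ssl.SSLCertVerificationError if the chain does not
    lead back to a trusted root or the name does not match.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

The common way to get this wrong is to switch these checks off to make an error go away, which leaves encryption in place but removes the identity guarantee.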

RSA vs Diffie Hellman
So as you can see above, RSA is used to solve the problem of identity, something that Diffie Hellman cannot do. However, DH can do something important that RSA key exchange cannot do, and that brings us to Forward Secrecy.

The Problem
When you go to a website that uses HTTPS (with plain RSA key exchange), your browser will receive and validate the server’s public key (part of their certificate) and then generate a session key. It will encrypt that key with the server’s public key, using RSA, and send it to the server. That’s how the two computers do key exchange. The rest of the session uses standard symmetric encryption based on that session key.

Now remember, an adversary can listen to and record this
entire communication, because the internet is a public network, but they can’t decrypt the conversation, so it’s meaningless, right?

What if, some time later, the adversary is able to
breach the server and gain access to the private key? There are many ways this could happen.

Now, every session that was recorded, from every previous communication
can be completely decrypted. It immediately unlocks all past secrets.

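The Solution
Remember back to Diffie Hellman. If the two parties had been using that method of key exchange, then the secret key NEVER goes across the wire, so there is nothing in the recording for a stolen private key to unlock. As long as both sides discard their temporary DH values after the session, the key cannot be recovered later.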

Perfect Forward Secrecy
This is the principle behind PFS: perfect forward secrecy. We still use RSA to verify the server's certificate, including its public key, but we use DH for the key exchange. This essentially provides the best of both worlds.

But you need to enable PFS on your server.

Speaking of things you need to enable on your server,
let's talk about HTTP Strict Transport Security — HSTS.

The Problem
This is based on the fact that a large portion of your users are going to type www.mybank.com into their browser’s address bar, without explicitly typing “https://”. The browser will default to insecure “http:” and then, hopefully, you’ve set up your web server to notice this and immediately issue a redirect to send the browser to the secure version.

The Problem
However, what if that initial insecure request was intercepted by a malicious party? They can send their own spoofed response, directing the user to https://not-my-bank.com which might trick the user into entering their password. Can we use encryption to solve this problem?

Strict Transport Security
Well, actually, this solution just uses a simple HTTP header, “Strict-Transport-Security”, which tells the web browser to never take the user to the insecure version and always go directly to the “https:” version of the site. The browser remembers this after its first secure visit. This is a simple thing you can do as a developer or server admin to make everyone more secure.
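Most web servers and frameworks have a setting for this header. To show what it amounts to, here is a small sketch as WSGI middleware in Python; `add_hsts` and `hello_app` are hypothetical names, and the one-year `max-age` with `includeSubDomains` is a common choice rather than a requirement.

```python
def add_hsts(app, max_age: int = 31536000):
    """Wrap a WSGI app so every response carries the HSTS header."""
    header = ("Strict-Transport-Security", f"max-age={max_age}; includeSubDomains")

    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Append the header to whatever the wrapped app already set.
            return start_response(status, list(headers) + [header], exc_info)

        return app(environ, start_with_hsts)

    return wrapped


def hello_app(environ, start_response):
    """A stand-in application, for illustration only."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


secure_app = add_hsts(hello_app)
```

Browsers only honor this header when it arrives over HTTPS, so the redirect from “http:” to “https:” is still needed for the very first visit.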

OK, but what about the DNS request? That’s still the
weak link in the chain, right?

DNS
DNS requests go over completely unencrypted channels. Any network-level attacker can spoof the response to a name lookup and send back the IP address of a malicious server instead. Since DNS is so insecure, this is also used to censor the internet. In the UK for example, ISPs are required to block certain websites at the DNS level.

The way to solve this, of course, is with encryption!

DNS over HTTPS
There is a protocol called DNS over HTTPS, or DoH, which will make sure that all name lookups happen across a secure channel. The good folks at Mozilla, Cloudflare and others are working to bring this to you this year.

It first appeared in Firefox nightly and is expected to
land in Firefox stable soon. Other browser makers, including Chrome, are putting it on their roadmap too.

But we still have so much left to encrypt. One
example is email and, importantly, verifying the sender of an email.

DKIM: DomainKeys Identified Mail
This is a way to publish a public key in the DNS for your domain, saying that any email claiming to be sent from an address at your domain needs to be signed with the private key that matches this public key.

This is something you can and should do today. Almost
every major email provider supports this.

Importantly, DKIM does not encrypt the body of the message,
but it does verify the authenticity of the sender, and that’s a good start to getting strong encryption everywhere.

So what are common mistakes that developers make?

Common Mistakes
• Sending information unencrypted, through insecure channels
• Passwords saved with reversible encryption
• Passwords hashed without using salt
• Using weak ciphers (encryption algorithms) or hashes
  ◦ In 2006, using SHA-1 was perfectly acceptable; today, practical collision attacks against it have been demonstrated.
• Using a strong cipher but with a leaky “block mode”
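For the two password mistakes above, here is a rough sketch of salted, deliberately slow hashing using PBKDF2 from Python's standard library. The function names and the iteration count are assumptions for this example; tune the count to your hardware, and consider a dedicated password hashing library if one is available to you.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor for this sketch


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated per password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes,
                   iterations: int = ITERATIONS) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", salt, stored)
assert not check_password("Tr0ub4dor&3", salt, stored)
```

You store the salt and the digest, never the password. Because of the salt, two users with the same password end up with different digests, so a precomputed table of common password hashes is useless.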

Common Mistakes
• Storing the key with the data
• Putting the key in the source code or config file
• Not verifying authenticity of a remote party
• Poor random number generation
• Creating a hash signature that is vulnerable to a length extension attack
• "I only have a simple blog site"
• 1024-bit RSA keys
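Two of these are cheap to avoid. The sketch below uses Python's `secrets` module for random values that need to be unpredictable, and reads the key from the environment instead of the source tree. The variable name `APP_ENCRYPTION_KEY` and both helper names are made up for this example; a dedicated secrets manager is another option.

```python
import os
import secrets


def new_session_token() -> str:
    """Generate an unguessable token from the OS's secure random source."""
    return secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text


def load_key(env_var: str = "APP_ENCRYPTION_KEY") -> bytes:
    """Read a hex-encoded key from the environment rather than from code."""
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"{env_var} is not set; refusing to fall back to a default key")
    return bytes.fromhex(value)


print(new_session_token())
```

The `random` module is fine for simulations and shuffling playlists, but its output is predictable and should not be used for tokens, salts, IVs or keys.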

There's a lot you can do. Understand the underpinnings of
encryption. Stay up to date on what is considered weak or strong in terms of cryptography. Think like an attacker: where is the weak link in your encryption?

Cryptography is a fun and fascinating field. I hope you
learned something.

Thanks
Find me on Twitter: @sstur_

More Decks by Simon Sturmer
