Not screwing up Encryption as a Developer

Strict transport security, certificate pinning, perfect forward secrecy, initialization vectors, DKIM, HMAC, TLS, AES, CA, OMG. What on earth are all those convoluted encryption terms and does it really matter to me as a developer?

This talk will present in plain English some of the most important and most misunderstood cryptographic concepts in our industry and dive into the areas most critical for you as a developer.

Keeping data secure through cryptography is an integral part of our lives as developers, and it turns out understanding encryption isn't as cryptic as you'd expect!

Simon Sturmer

July 19, 2019

Transcript

  1. Not screwing up
    Encryption
    as a Developer

  2. I'm
    Simon
    I'm a developer. I like building
    stuff and learning stuff.
    Find me on Twitter: @sstur_

  3. Encryption
    Simply put, encryption is the process of encoding information in
    such a way that only authorized parties can access it.
    Encryption is part of the broader topic of cryptography. This talk
    is about how cryptography relates to you as a developer.

  4. Why would a developer need
    to understand this?
    Software is about processing information.
    People and businesses trust your software with their information.
    In many ways, it falls on you as a software developer to
    understand how to keep information safe.

  5. Encryption is one important way to keep
    information safe from attackers and
    unauthorized parties.

  6. Uses of Cryptography
    ● Communicating without letting unauthorized parties read the
    message.
    ● Verifying the integrity of a message or file: being sure it has
    not been modified.
    ● Storing data so that only you can read it later.
    ● Validating that the party you are communicating with is indeed
    who they claim to be.
    ● Saving passwords in a way that cannot be reversed to see the
    original password.
    ● Generating unique IDs, such as in Git (...)

  7. Security Fail
    There are many ways, as developers, we fail at security, such as:
    ● forgetting to password protect a database
    ● exposing private data through a web interface or API
    ● writing software with vulnerabilities that allow an attacker to
    run code on your server
    ● publishing sensitive information to a public repository

  8. Crypto Fail
    One category of ways developers fail at security is around
    cryptography.
    ● Sending information unencrypted, through insecure channels
    ● Using weak encryption
    ● Not verifying authenticity of a remote party
    ● Storing sensitive info like passwords in insecure ways

  9. Let's talk
    In order to understand how to do this correctly as a developer, we
    should first talk about:
    1. What problems we are trying to solve
    2. The different kinds of encryption
    3. How they work together to solve such problems

  10. Cryptography can help us
    ● Securely communicate with someone over an insecure
    channel
    ● Securely store data so that only we (or someone we trust)
    can retrieve it
    ● Trust: Verify that an unknown party is who they claim to be
    ● Data integrity: Verify that some data sent from that party has
    not been modified by someone else
    ● Key derivation: Create an encryption key from a text such as
    a password

  11. Types of cryptography
    There are 3 important types of cryptography we’ll explore today:
    ● Hashing
    ● Symmetric Encryption
    ● Asymmetric Encryption, including:
    ○ key exchange
    ○ public key cryptography

  12. Hashing

  13. Hashing
    A hash is a sort of one-way message digest. It takes some data of
    arbitrary length as input (e.g. a password, or an entire file) and
    generates a fixed-length output (the digest or hash) that
    “represents” the input in a way that follows these two principles:
    ● Deterministic: The same input will always generate the same
    output.
    ● One-way: The input cannot be determined from the output.
    Information is lost in the process, making it impossible to
    reconstruct the original.
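
    As a minimal sketch of these properties, the snippet below uses SHA-256 from Python's standard hashlib module. The helper name sha256_hex and the sample inputs are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"hunter2")
long_digest = sha256_hex(b"x" * 1_000_000)

# Deterministic: hashing the same input again gives the same digest.
assert short_digest == sha256_hex(b"hunter2")

# Fixed length: 256 bits is 64 hex characters, whatever the input size.
assert len(short_digest) == len(long_digest) == 64

# Changing one character of the input gives an unrelated-looking digest.
print(sha256_hex(b"hunter2"))
print(sha256_hex(b"hunter3"))
```

    Nothing in the digest lets you work backwards to the input; the only generic approach is to guess inputs and hash them.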

  14. Collisions
    Since the output (hash) is a fixed size, which is often fewer bytes
    than the input, it cannot represent as much data: there are fewer
    possible hashes than there are different inputs. Thus, it’s possible
    that two different inputs result in the same hash value. This is
    called a collision.
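
    To make collisions concrete, here is a sketch using a deliberately tiny hash: only the first 2 bytes of SHA-256, so there are just 65,536 possible outputs. The names tiny_hash and find_collision are invented for this example; no real system should truncate a hash this far.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> bytes:
    """Toy 16-bit hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def find_collision() -> tuple[bytes, bytes]:
    """Hash b"0", b"1", b"2", ... until two inputs share a tiny_hash."""
    seen: dict[bytes, bytes] = {}
    for i in count():
        message = str(i).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message


first, second = find_collision()
assert first != second
assert tiny_hash(first) == tiny_hash(second)
print(first, second, tiny_hash(first).hex())
```

    With only 65,536 outputs a collision is guaranteed within 65,537 inputs, and usually turns up after a few hundred. The full 256-bit output makes the same search infeasible.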

  15. Avoiding Collisions
    A good hash algorithm minimizes the probability of collisions
    occurring in the wild.
    Take the SHA-256 hashing algorithm for example (the one used in
    Bitcoin mining). No two inputs have ever been discovered that
    produce the same output. The possibility of finding one in our
    lifetime is incredibly small.

  16. Uses of Hashing
    Hashing is often used as a “checksum” which can verify the
    integrity of some data, for example to make sure a network error
    didn’t cause data corruption during transmission.

  17. Uses of Hashing
    Another use case is to sign a message using a secret that only you
    know. If you later receive that message back (e.g. session cookie)
    then you know that it was not modified.
    One way to do this is using HMAC (Hash-based Message
    Authentication Code).

  18. HMAC

  19. HMAC
    If you and I both know the secret, then I can send you a signed
    message and you can verify that it was indeed from me, and not
    modified by someone intercepting the message.
    But why is it hashed twice?
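
    The "hashed twice" structure can be seen by writing HMAC out by hand. The sketch below follows the construction in RFC 2104 with SHA-256 and checks the result against Python's built-in hmac module. The secret and message values are placeholders; in real code you would call the hmac module directly.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes its input in 64-byte blocks


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC written out by hand: an inner hash wrapped in an outer hash."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_key + message).digest()  # first hash
    return hashlib.sha256(outer_key + inner).digest()     # second hash


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(hmac_sha256(key, message), tag)


secret = b"shared-secret"   # placeholder value
message = b"user_id=42"     # e.g. the contents of a session cookie
tag = hmac_sha256(secret, message)

# The hand-written version agrees with the standard library.
assert tag == hmac.new(secret, message, hashlib.sha256).digest()
```

    A receiver who knows the secret calls verify(secret, message, tag); any change to the message or the tag makes it return False.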

  20. Length Extension Attack
    Some hashes have a weakness that allows a length extension
    attack.
    When a Merkle–Damgård based hash is misused as a message authentication
    code with the construction H(secret ‖ message), and the message and the
    length of the secret are known, a length extension attack allows anyone to
    append extra information to the message and produce a valid hash without
    knowing the secret.
    HMAC hashes twice so that the inner hash state is never exposed, which
    defeats this attack.

  21. But regardless of good hashing practice, this
    does not hide the contents of the message from
    anyone.
    For that we need to use encryption.

  22. Symmetric
    Encryption

  23. Symmetric Encryption
    Throughout history, people have needed to encode a message so
    that only the intended recipient can understand it.
    In the very old days, the "method of encryption" was the secret.
    In modern encryption, the method of encryption (cipher) is public
    but the key is secret.

  24. Substitution Cipher
    The simplest form of using a key to encrypt a message is a
    "substitution cipher" in which each letter is replaced by a different
    letter.
    In such a system, the "key" is the list of substitutions, so the
    recipient can swap back in the original letters.
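
    A substitution cipher fits in a few lines. In this sketch the key is a shuffled alphabet; the function names are invented for the example, and the fixed seed exists only to make the demo repeatable.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def make_key(seed: int) -> str:
    """Return a shuffled alphabet. The shuffle is the secret key."""
    letters = list(ALPHABET)
    random.Random(seed).shuffle(letters)  # repeatable demo, not secure randomness
    return "".join(letters)


def encrypt(plaintext: str, key: str) -> str:
    """Replace each letter with the key letter at the same position."""
    return plaintext.upper().translate(str.maketrans(ALPHABET, key))


def decrypt(ciphertext: str, key: str) -> str:
    """Swap the original letters back in using the same key."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))


key = make_key(seed=1)
ciphertext = encrypt("attack at dawn", key)
print(ciphertext)
assert decrypt(ciphertext, key) == "ATTACK AT DAWN"
```

    Note that every "A" in the message becomes the same ciphertext letter, which is the weakness discussed next.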

  25. Substitution Cipher

  26. Substitution Cipher
    However, this is easily broken, even without a computer.
    Anyone know how?

  28. A Stronger Substitution Cipher
    One way to make that stronger is to change the character
    mapping after each character processed.
    It would be changed according to some predetermined method.
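
    One simple "predetermined method" is to shift the alphabet by a growing amount after every letter. The sketch below is an illustration of that idea only, not how any historical machine worked; the key here is the pair (start, step), both made-up parameters.

```python
import string

ALPHABET = string.ascii_uppercase


def _transform(text: str, start: int, step: int, direction: int) -> str:
    """Shift each letter; the shift amount changes after every letter."""
    out = []
    shift = start
    for ch in text.upper():
        if ch in ALPHABET:
            index = (ALPHABET.index(ch) + direction * shift) % 26
            out.append(ALPHABET[index])
            shift += step  # the mapping changes for the next letter
        else:
            out.append(ch)  # leave spaces and punctuation alone
    return "".join(out)


def encrypt(plaintext: str, start: int, step: int) -> str:
    return _transform(plaintext, start, step, +1)


def decrypt(ciphertext: str, start: int, step: int) -> str:
    return _transform(ciphertext, start, step, -1)


# The same plaintext letter no longer maps to the same ciphertext letter.
assert encrypt("AAAA", start=1, step=1) == "BCDE"
assert decrypt("BCDE", start=1, step=1) == "AAAA"
```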

  29. Enigma
    This is how the Enigma machine worked in WW2.
    The way in which the mapping changed with each keypress was
    effectively the key.

  30. Distributing the Key
    These keys were distributed manually, on paper, and the machine
    needed to be re-configured with each new key.

  31. This brings us to the next important part of
    encryption, key exchange.

  32. Key Exchange
    In symmetric ciphers, like the Enigma, or even modern day ones
    such as AES, both parties need to have a secret key.
    Getting that key from one person to another was historically done
    physically. This can present a huge challenge — More on this later.

  33. Blocks of Data
    Modern symmetric algorithms, including AES, encrypt data in
    blocks.
    Data is divided up into equal size blocks (the block size of the
    algorithm), padded if necessary, and each block is encrypted
    individually.

  34. Block Mode: ECB
    If you simply use the key to encrypt each block independently, that
    approach is “ECB” (Electronic Codebook), one of several block modes
    an algorithm can run in.
    The problem is that encryption is deterministic for a given key,
    meaning that two identical input blocks will result in identical
    output blocks. Over many blocks, a pattern can emerge.
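
    The ECB leak is easy to reproduce. Python's standard library has no AES, so this sketch substitutes a toy 4-round Feistel cipher built on SHA-256 as the block cipher. That stand-in is an assumption made purely for the demo and is NOT secure; only the block mode behaviour is the point.

```python
import hashlib

BLOCK = 16  # block size in bytes (the same as AES)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel cipher standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, _xor(left, f)
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> list[bytes]:
    """Encrypt every 16-byte block independently with the same key."""
    assert len(plaintext) % BLOCK == 0, "pad the input to a multiple of 16 first"
    return [
        toy_encrypt_block(key, plaintext[i:i + BLOCK])
        for i in range(0, len(plaintext), BLOCK)
    ]


key = b"demo key"
plaintext = b"ATTACK AT DAWN!!" * 2 + b"RETREAT AT DUSK!"
blocks = ecb_encrypt(key, plaintext)

# Identical plaintext blocks give identical ciphertext blocks.
assert blocks[0] == blocks[1]
assert blocks[0] != blocks[2]
```

    An eavesdropper who sees blocks[0] == blocks[1] learns that the message repeats itself without ever learning the key.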

  36. The solution is to use a block mode other than
    the naïve “ECB” mode.

  37. Block Mode: CBC
    CBC (Cipher Block Chaining) mode, for example, uses a randomly
    generated IV (initialization vector) as a sort of salt: it is XORed
    with the first block before encryption. Each subsequent block is
    XORed with the encrypted form of the previous block instead.

  39. Block Mode: CBC
    Thus, two messages with the same content will result in different
    output, assuming the IV is different each time.
    The IV itself is not secret; it must be provided along with the
    encrypted data because it is needed for decryption.
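
    Here is a sketch of CBC chaining with a random IV that travels in front of the ciphertext. Because the standard library has no AES, a toy Feistel cipher built on SHA-256 stands in for the block cipher; that substitution is an assumption for the demo and is NOT secure. In CBC, each plaintext block is XORed with the previous ciphertext block (or the IV) before being encrypted.

```python
import hashlib
import secrets

BLOCK = 16  # block size in bytes (the same as AES)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel cipher standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return IV + ciphertext. A fresh random IV is used on every call."""
    assert len(plaintext) % BLOCK == 0, "pad the input to a multiple of 16 first"
    previous = secrets.token_bytes(BLOCK)  # the IV
    out = [previous]
    for i in range(0, len(plaintext), BLOCK):
        previous = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, data: bytes) -> bytes:
    """Split off the IV (first block), then undo the chaining."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return b"".join(
        _xor(toy_decrypt_block(key, current), previous)
        for previous, current in zip(blocks, blocks[1:])
    )


key = b"demo key"
message = b"ATTACK AT DAWN!!" * 2
first = cbc_encrypt(key, message)
second = cbc_encrypt(key, message)

assert first != second                     # same message, different output
assert first[16:32] != first[32:48]        # repeated blocks no longer match
assert cbc_decrypt(key, first) == message  # the IV is read from the data itself
```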

  41. Asymmetric
    Encryption

  42. Asymmetric Encryption
    Prior to the 1970s the only way to use encryption to communicate
    with someone was if you both had a shared secret; some
    encryption key that you both know but no one else knows.
    Securely getting a shared secret to someone presented many
    challenges because it could not be done electronically.

  43. But then, in the 1970s, we got
    Public Key Cryptography
    in two important forms

  44. Whitfield Diffie and Martin Hellman published a
    concept in 1976, of negotiating a secret key over
    an insecure channel.
    This is a fascinating method known as
    Diffie Hellman key exchange
    and is used to this day.

  45. Ron Rivest, Adi Shamir, and Leonard Adleman at
    MIT came up with what is now known as RSA, a
    method of asymmetric encryption involving a
    public key and a private key.
    This is used to this day for many things including
    TLS (HTTPS).

  46. Diffie Hellman
    Let's start with Diffie Hellman, a method for secure key exchange.
    The idea is that two parties can communicate in public, and yet
    still end up with a shared secret that no one else can guess, even
    if someone is listening in to the whole exchange.
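
    The exchange can be sketched with Python's built-in modular exponentiation. The prime P and base G below are illustration values chosen for this example (P is the Mersenne prime 2^127 - 1, far too small for real use); real deployments use standardized groups of 2048 bits or more, or elliptic curves.

```python
import secrets

P = 2**127 - 1  # public prime modulus (toy size, for illustration only)
G = 3           # public base (an arbitrary choice for this demo)


def make_keypair() -> tuple[int, int]:
    """Pick a random private number and compute the public value G^private mod P."""
    private = secrets.randbelow(P - 3) + 2
    public = pow(G, private, P)
    return private, public


def shared_secret(my_private: int, their_public: int) -> int:
    """Combine my private number with the other party's public value."""
    return pow(their_public, my_private, P)


# Each side keeps its private number and sends only the public value.
alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Both sides arrive at the same secret, which never crossed the wire.
assert shared_secret(alice_private, bob_public) == shared_secret(bob_private, alice_public)
```

    An eavesdropper sees P, G, alice_public and bob_public, but recovering either private number from those is the hard problem the scheme rests on.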

  48. Diffie Hellman
    To extend the paint analogy a little more, it’s based on the
    principle that it’s easy to mix two colors together, but difficult to
    determine what two colors went into the mixture.
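    This is a form of "one-way" function: an operation that is easy to
    perform but hard to reverse. Checking a proposed answer is easy,
    like trying a code on a combination lock, but finding the answer
    from scratch is hard. (A "trapdoor" function is a one-way function
    that becomes easy to reverse if you hold a secret, which is what
    RSA relies on.)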

  49. RSA uses prime number factorization as its
    one-way problem, while Diffie Hellman relies on
    the discrete logarithm problem.
    It’s easy to multiply two large primes together,
    but hard to determine which two primes were
    multiplied together, given only the product.
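
    The asymmetry shows up even with small numbers. In this sketch, multiplying is a single operation, while recovering the factors by trial division (the most naive method, chosen here for clarity) takes tens of thousands of steps. The two primes are the 10,000th and 100,000th primes, picked only because they are well known.

```python
def factor_by_trial_division(n: int) -> tuple[int, int]:
    """Find two factors of an odd number by trying every odd divisor in turn."""
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 2
    raise ValueError("no odd factor found; n is prime")


p, q = 104_729, 1_299_709

# Easy direction: one multiplication.
n = p * q

# Hard direction: roughly 52,000 trial divisions for this tiny example.
assert factor_by_trial_division(n) == (p, q)
```

    The work for trial division grows with the square root of n, so for the 2048-bit products used in real RSA keys this approach is hopeless, and even the best known algorithms remain impractical.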

  50. Use case: Diffie Hellman
    For example, the Diffie Hellman algorithm is used by SSH.
    At the beginning of the connection, the two computers establish a
    shared secret using DH and then use that to derive the key used
    in symmetric encryption for the rest of the session.

  51. Public Key Encryption
    RSA is the oldest and most commonly used form of public key
    encryption, a type of asymmetric encryption that uses a
    public + private key pair.
    Unlike symmetric encryption, there are two keys. Anything
    encrypted by the public key can only be decrypted by the private
    key, and vice versa.
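
    A "textbook RSA" sketch shows the two keys undoing each other. Everything here is a simplifying assumption for illustration: the primes are two known Mersenne primes rather than random secret ones, the modulus is only about 150 bits, and there is no padding. Real code should use a vetted library with proper padding such as OAEP.

```python
P = 2**61 - 1  # known primes, used only so the demo is reproducible;
Q = 2**89 - 1  # real keys use random secret primes of 1024+ bits each
N = P * Q      # the modulus, part of both keys
E = 65537      # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def apply_key(value: int, exponent: int) -> int:
    """RSA's core operation: value^exponent mod N."""
    return pow(value, exponent, N)


message = int.from_bytes(b"hello", "big")  # must be smaller than N

# Encrypt with the public key (E, N), decrypt with the private key (D, N).
ciphertext = apply_key(message, E)
assert ciphertext != message
assert apply_key(ciphertext, D) == message

# And vice versa: transform with the private key, undo with the public key.
assert apply_key(apply_key(message, D), E) == message
```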

  52. Use Cases
    This allows you to send a message that only the receiver can
    decrypt.
    There are several interesting uses, we'll focus on two:
    ● Establishing a shared secret
    ● Signing a message for the public

  53. 1. Key Exchange
    Establishing a shared secret between two parties, over a public
    communication channel, just like Diffie Hellman.
    For example, if I know your public key, I can generate a random
    secret, encrypt it with your public key and send it to you.
    You can then decrypt it and we have a shared secret. The
    advantage is that it didn’t require the sort of chatty,
    back-and-forth communication that Diffie Hellman requires.

  54. The disadvantage of this is that it does not
    provide forward secrecy, something we’ll talk
    about more soon.

  55. Signing a message with RSA
    Imagine, I want to send you a file, and I want to include a way for
    you to know that it hasn’t been modified.
    I can hash the file and provide that hash. But that alone doesn’t
    really prevent intentional modification by a third party, because
    the adversary can generate a hash too.

  56. 2. Signing a message
    I would actually encrypt that hash value using my private key.
    Assuming you know my public key, you can decrypt my hash,
    compute your own hash of the file and verify the two hashes
    match. This effectively proves that the file came from me (or
    someone with my private key) and has not been modified.
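
    The hash-then-sign flow can be sketched with the same textbook RSA arithmetic. The key below is a toy built from two known Mersenne primes, and the hash is reduced to fit the small modulus; both are assumptions made to keep the example self-contained. Real signatures use a vetted library and a padding scheme such as PSS.

```python
import hashlib

P = 2**61 - 1  # toy key material from known primes, for illustration only
Q = 2**89 - 1
N = P * Q
E = 65537                          # public exponent (shared with everyone)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (kept by the signer)


def digest_as_int(data: bytes) -> int:
    """Hash the data, then reduce the hash so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Signer: transform the hash with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Receiver: undo the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(data)


document = b"Pay Bob 10 dollars"
signature = sign(document)

assert verify(document, signature)
assert not verify(b"Pay Bob 1000 dollars", signature)
```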

  57. The hash encrypted with my private key is
    effectively the signature.
    But you need to know my public key for this to
    work.

  58. Getting the Public Key
    You could get it separately, from a trusted third party, but then we
    have a new problem of securely distributing everyone’s public
    key.
    This isn’t really feasible at web scale.

  59. The way that it's actually done is that I send you
    the public key along with the file and the
    signature.

  60. Anyone could send you a file with a signature and
    a public key, but only I can send you a document
    with a signature that was generated from
    my key pair.

  61. So how do you know it’s actually my public key?

  62. You need some way to trust that the sender is who
    they say they are.
    This is essentially a new problem, one of identity
    and trust.

  63. Certificate Pinning
    If the two computers are both controlled by you, such as servers
    in different data centers, and you want to be sure no one
    intercepts the communication, you can use “certificate pinning”
    which is just pre-loading the “certificate” of the other machine on
    each machine.
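
    One common way to pin is to store the SHA-256 fingerprint of the other machine's certificate and compare it on every connection. The sketch below uses Python's ssl module; the function names, and the idea of pinning the whole certificate rather than just its public key, are choices made for this example.

```python
import hashlib
import hmac
import socket
import ssl


def fetch_certificate_der(host: str, port: int = 443) -> bytes:
    """Connect over TLS and return the server's certificate in DER form."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def matches_pin(certificate_der: bytes, pinned_sha256_hex: str) -> bool:
    """Compare the certificate's SHA-256 fingerprint with the pre-loaded pin."""
    fingerprint = hashlib.sha256(certificate_der).hexdigest()
    return hmac.compare_digest(fingerprint, pinned_sha256_hex.lower())
```

    In use, you would record the fingerprint once over a channel you trust, ship it with your software, and refuse to talk to the server whenever matches_pin(fetch_certificate_der(host), pin) is False. Remember that the pin must be updated whenever the certificate is renewed.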

  64. Of course, that doesn’t work for the public, who
    don’t have a database of all the valid certificates
    of every server on the internet.

  65. 3. Chain of trust
    The most common approach to this is to use PKI — public key
    infrastructure — to establish a chain of trust.
    This is an important part of SSL (more accurately TLS) used in
    HTTPS.

  67. Public Key Infrastructure
    We establish a chain of trust back to a well-known trusted
    authority, a CA (certificate authority). In the case of the web,
    there is a set of trusted root CAs. These are entities that are
    globally recognized and widely trusted, such as letsencrypt.org.
    The certificate (including public key) of each root CA is
    pre-installed in your browser or operating system.

  69. Example of this process
    For this example, let’s say you are communicating with a server on
    the internet, a server which claims to be that of mybank.com. The
    server will send you their public key along with some information
    (such as company name), collectively their certificate, which will be
    “signed” by a root certificate authority.

  70. If the CA that signed the server’s certificate is in
    your list of root CAs on your computer, then you
    have the public key on file and you can validate
    that signature, effectively validating that the
    server is who it claims to be.

  71. RSA vs Diffie Hellman
    So as you can see above, RSA is used to solve the problem of
    identity, something that Diffie Hellman cannot do.
    However, DH can do something important that RSA also cannot
    do, and that brings us to Forward Secrecy.

  72. The Problem
    When you go to a website that uses HTTPS, your browser will
    receive and validate the server’s public key (part of their
    Certificate) and then generate a session key. It will encrypt that
    key with the server’s public key, using RSA, and send it to the
    server. That’s how the two computers do key exchange. The rest
    of the session uses standard symmetric encryption based on that
    session key.

  73. Now remember, an adversary can listen to and record
    this entire communication, because the internet is a
    public network, but they can’t decrypt the
    conversation, so it’s meaningless, right?

  74. What if, some time later, the adversary is able to
    breach the server and gain access to the private key?
    There are many ways this could happen.

  75. Now, every session that was recorded, from every
    previous communication can be completely
    decrypted. It immediately unlocks all past secrets.

  76. The Solution
    Remember back to Diffie Hellman.
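    If the two parties had been using that method of key exchange,
    with fresh (ephemeral) values for every session, then the secret key
    NEVER goes across the wire. Once the session ends and those values
    are discarded, the key cannot be recovered later, even with the
    server's private key.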

  77. Perfect Forward Secrecy
    This is the principle behind PFS — perfect forward secrecy.
    We still use RSA to verify the SSL certificates including public key,
    but we use DH for the key exchange. This essentially provides the
    best of both worlds.

  78. But you need to enable PFS on your server.

  79. Speaking of things you need to enable on your
    server, let's talk about HTTP Strict Transport
    Security — HSTS.

  80. The Problem
    This is based on the fact that a large portion of your users are
    going to type www.mybank.com into their browser’s address bar,
    without explicitly typing “https://”.
    The browser will default to insecure “http:” and then, hopefully
    you’ve set up your web server to notice this and immediately issue
    a redirect to send the browser to the secure version.

  81. The Problem
    However, what if that initial insecure request was intercepted by a
    malicious party? They can send their own spoofed response,
    directing the user to https://not-my-bank.com which might trick
    the user into entering their password.
    Can we use encryption to solve this problem?

  82. Strict Transport Security
    Well actually this solution just uses a simple HTTP response header,
    “Strict-Transport-Security”, which tells the web browser that, for a
    period you specify, it should never request the insecure version and
    should always go directly to the “https:” version of the site.
    This is a simple thing you can do as a developer or server admin
    to make everyone more secure.
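
    As a sketch of what sending the header looks like, the handler below uses Python's built-in http.server. The helper name hsts_value, the one-year max-age, and the includeSubDomains choice are assumptions for the example; most sites would set this header in their web server or framework configuration instead.

```python
from http.server import BaseHTTPRequestHandler


def hsts_value(max_age_seconds: int = 31_536_000, include_subdomains: bool = True) -> str:
    """Build the header value; max-age is how long the browser remembers the rule."""
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


class HstsHandler(BaseHTTPRequestHandler):
    """Adds Strict-Transport-Security to every GET response."""

    def do_GET(self) -> None:
        self.send_response(200)
        self.send_header("Strict-Transport-Security", hsts_value())
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"hello over HTTPS\n")
```

    Browsers only honor this header when it arrives over HTTPS, so the handler needs to be served behind TLS for the rule to take effect.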

  83. OK, but what about the DNS request, that’s still
    the weak link in the chain, right?

  84. DNS
    DNS requests go over completely unencrypted channels. Any
    network-level attacker can spoof a name request and send back
    an IP address to a malicious server instead.
    Since DNS is so insecure, this is also used to censor the internet.
    In the UK for example, ISPs are required to block certain websites
    at the DNS level.

  85. The way to solve this, of course, is with
    encryption!

  86. DNS over HTTPS
    There is a protocol called DNS over HTTPS or DoH which will
    make sure that all name lookups happen across a secure channel.
    The good folks at Mozilla, Cloudflare and others are working to
    bring this to you this year.
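
    Under the hood, a DoH lookup is an ordinary DNS query message carried inside an HTTPS request. The sketch below builds the GET form described in RFC 8484: the binary query is base64url-encoded without padding and placed in a "dns" parameter. The function names are invented here, and the Cloudflare endpoint is just one example resolver, used as an assumption.

```python
import base64
import struct


def build_dns_query(name: str) -> bytes:
    """Build a minimal DNS query (type A, class IN) in wire format."""
    # Header: ID 0, flags 0x0100 (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    question = qname + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def build_doh_url(name: str, endpoint: str = "https://cloudflare-dns.com/dns-query") -> str:
    """Return the HTTPS URL that carries the DNS query for name."""
    query = build_dns_query(name)
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


print(build_doh_url("www.example.com"))
```

    The URL is then fetched over HTTPS with the header "Accept: application/dns-message", so anyone watching the network sees only an encrypted connection to the resolver.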

  87. It first appeared in Firefox nightly and is expected to land in
    Firefox stable soon. Other browser makers, including Chrome, are
    putting it on their roadmap too.

  88. But we still have so much left to encrypt.
    One example is email, importantly, verifying the
    sender of an email.

  89. DKIM: DomainKeys Identified Mail
    This is a way to publish a public key in the DNS records for your
    domain, saying that any email claiming to be sent from an address
    at your domain needs to be signed with the private key that
    matches this public key.

  90. This is something you can and should do today.
    Almost every major email provider supports this.

  91. Importantly, DKIM does not encrypt the body of
    the message, but it does let the receiver verify that
    the message really came from the sender's domain, and
    that’s a good start to getting strong encryption everywhere.

  92. So what are common mistakes that
    developers make?

  93. Common Mistakes
    ● Sending information unencrypted, through insecure channels
    ● Passwords saved with reversible encryption
    ● Passwords hashed without using salt
    ● Using weak ciphers (encryption algorithms) or hashes
    ○ In 2006, using SHA-1 was widely considered acceptable; since
    2017, practical collision attacks against it have been public.
    ● Using a strong cipher but with a leaky “block mode”
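
    To avoid the two password mistakes above, store a salted, deliberately slow hash instead of the password or a reversible encryption of it. This sketch uses PBKDF2 from Python's hashlib with a random salt from the secrets module. The function names and the iteration count are assumptions for the example; tune the count to your hardware, or use a dedicated password hashing library.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; higher is slower for attackers too


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; neither reveals the password."""
    salt = secrets.token_bytes(16)  # unique per password, from a secure RNG
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", salt, stored)
assert not check_password("wrong guess", salt, stored)
```

    Because every password gets its own salt, two users with the same password end up with different stored digests, and precomputed lookup tables stop being useful.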

  94. Common Mistakes
    ● Storing the key with the data
    ● Putting the key in the source code or config file
    ● Not verifying authenticity of a remote party
    ● Poor random number generation
    ● Creating a hash signature that is vulnerable to a length extension
    attack
    ● "I only have a simple blog site"
    ● 1024-bit RSA keys

  95. There's a lot you can do.
    Understand the underpinnings of encryption.
    Stay up to date on what is considered to be weak or strong in
    terms of cryptography.
    Think like an attacker. Where is the weak link in your encryption?

  96. Cryptography is a fun and fascinating field.
    I hope you learned something.

  97. Thanks
    Find me on Twitter: @sstur_
