Slide 1

Slide 1 text

Randomness in Python: Controlled Chaos in an Ordered Machine by @amandasopkin

Slide 2

Slide 2 text

@amandasopkin

Slide 3

Slide 3 text

Randomness
● Makes processes secure
● Mathematically/computationally, biologically, philosophically important
● Difficult to actually achieve

Slide 4

Slide 4 text

Why do we need randomness?

Slide 5

Slide 5 text

4oio342ip4o24p32o

Slide 6

Slide 6 text

4fdslf95454

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Problems with randomness
● The seed, or starting point
● The algorithm

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

1. Determined that user IDs were seeded with restart time
2. Crashed the Hacker News site
3. Predicted restart time
4. Predicted assigned user IDs as users logged in
5. Impersonated discovered users
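The attack above depends on a generator whose seed can be guessed. The following is a minimal sketch of that idea, not the actual Hacker News code: the `restart_time_ms` value, the 64-bit IDs, and the helper `recover_seed` are all assumptions made for illustration.

```python
import random

# Hypothetical server: seeds its generator with the restart time in milliseconds.
restart_time_ms = 1_700_000_000_000  # illustrative value, not real data
server_rng = random.Random(restart_time_ms)
first_id = server_rng.getrandbits(64)   # ID the attacker receives by logging in
second_id = server_rng.getrandbits(64)  # ID later handed to another user


def recover_seed(observed_id, earliest_ms, latest_ms):
    """Try every candidate restart time until one reproduces the observed ID."""
    for candidate in range(earliest_ms, latest_ms + 1):
        if random.Random(candidate).getrandbits(64) == observed_id:
            return candidate
    return None


# The attacker only knows the restart time to within about a second.
recovered = recover_seed(first_id, restart_time_ms - 500, restart_time_ms + 500)

# Replaying the generator from the recovered seed yields the other user's ID.
attacker_rng = random.Random(recovered)
attacker_rng.getrandbits(64)  # skip the ID the attacker already holds
predicted_second_id = attacker_rng.getrandbits(64)
print(predicted_second_id == second_id)
```

The search space here is only about a thousand candidates, which is why a time-based seed offers so little protection.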

Slide 12

Slide 12 text

DUAL_EC_DRBG Controversy
● 2004: Dual EC PRNG introduced

Slide 13

Slide 13 text

DUAL_EC_DRBG Controversy
● 08/2007: Shumow and Ferguson present Dual_EC_DRBG flaw at cryptography conference

Slide 14

Slide 14 text

DUAL_EC_DRBG Controversy
● 11/2007: Schneier bases article in Wired on their findings

Slide 15

Slide 15 text

DUAL_EC_DRBG Controversy
“...would allow NSA to determine the state of the random number generator, and thereby eventually be able to read all data sent over the SSL connection.”

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

DUAL_EC_DRBG Controversy
● 09/2013: One of the purposes of Bullrun is described as being "to covertly introduce weaknesses into the encryption standards followed by hardware and software developers around the world."

Slide 18

Slide 18 text

DUAL_EC_DRBG Controversy
● NIST recommends removal of the algorithm as a standard

Slide 19

Slide 19 text

DUAL_EC_DRBG Controversy
● 2004: Dual EC PRNG introduced
● 08/2007: Shumow and Ferguson present Dual_EC_DRBG flaw at cryptography conference
● 11/2007: Schneier bases article in Wired on their findings

Slide 20

Slide 20 text

DUAL_EC_DRBG Controversy
● 09/2013: One of the purposes of Bullrun is described as being "to covertly introduce weaknesses into the encryption standards followed by hardware and software developers around the world."
● 12/2013: Presidential advisory examines encryption standards
● 2014: Standard is removed

Slide 21

Slide 21 text

Years until standard removed... 10!

Slide 22

Slide 22 text

Who did this impact?
Microsoft, Google, Apple, McAfee, Docker, IBM, Oracle, Cisco, VMware, Juniper, HP, Red Hat, Samsung, Toshiba, Dell, Ruckus, F5 Networks, Lenovo, Nokia, the RSA BSAFE libraries for Java and C++, and more...

Slide 23

Slide 23 text

OK, so you want to create randomness...

Slide 24

Slide 24 text

An ideal pseudo random number generator should...

Slide 25

Slide 25 text

An ideal pseudo random number generator should...
1. Pass statistical tests of randomness
   Example tests: Monobit, Distance, Poker or Craps, Birthday
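As an illustration of the first test named above, here is a minimal sketch of a monobit (frequency) test, which checks whether ones and zeros are roughly balanced. The function name `monobit_p_value` and the 0.01 significance level are assumptions chosen for this example.

```python
import math
import random


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for 'ones and zeros are balanced'."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum; a balanced sequence sums to about 0.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


ALPHA = 0.01  # significance level (an assumption for this sketch)

balanced = [0, 1] * 500  # perfectly balanced, but obviously not random
stuck = [1] * 1000       # a generator stuck on 1
sample = [random.getrandbits(1) for _ in range(1000)]

print(monobit_p_value(balanced) >= ALPHA)  # passes
print(monobit_p_value(stuck) >= ALPHA)     # fails
print(monobit_p_value(sample))
```

The alternating sequence passes even though it is completely predictable, which is why a generator is usually run against a whole battery of tests rather than a single one.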

Slide 26

Slide 26 text

An ideal pseudo random number generator should...
1. Pass statistical tests of randomness
2. Take a long time before repeating (have a long “period”)

Slide 27

Slide 27 text

An ideal pseudo random number generator should...
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently (quick & low storage)

Slide 28

Slide 28 text

An ideal pseudo random number generator should...
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently
4. Be repeatable

Slide 29

Slide 29 text

An ideal pseudo random number generator should...
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently
4. Be repeatable
5. Be portable (can be run on any machine or system)
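The "repeatable" property in the list above can be sketched with Python's built-in generator: the same seed gives the same sequence. The helper name `simulate` and the seed values are assumptions for illustration.

```python
import random


def simulate(seed, n=5):
    """Return n pseudo random floats from a private, explicitly seeded generator."""
    rng = random.Random(seed)  # independent of the module-level generator
    return [rng.random() for _ in range(n)]


run_a = simulate(seed=1234)
run_b = simulate(seed=1234)
run_c = simulate(seed=5678)

print(run_a == run_b)  # same seed, same sequence
print(run_a == run_c)  # different seed, different sequence
```

Repeatability is what makes simulations and tests reproducible, and it is also exactly what an attacker exploits when the seed can be guessed.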

Slide 30

Slide 30 text

What are the common ways of generating “randomness”?

Slide 31

Slide 31 text

Linear congruential generators
Linear congruential generators take the form $x_k = (a x_{k-1} + c) \bmod M$, where $x_0$ is the seed, the integer $M$ is typically chosen close to the largest representable integer, and the period is at most $M$.

Slide 32

Slide 32 text

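Linear congruential generators
a = 3        # multiplier
c = 9        # increment
m = 16       # modulus
seed = 4394  # starting value x0

def lcg():
    xi = seed
    for i in range(10):
        # Each value is computed from the previous one
        xi = (a * xi + c) % m
        print(xi)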

Slide 33

Slide 33 text

Linear congruential generators
Algorithm: xi = (a*xi + c) % m
Output: 7 14 3 2 15 6 11 10 7
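The output above returns to 7 after only eight values. A minimal sketch for measuring that cycle length, using the slide's parameters (a = 3, c = 9, m = 16, seed 4394); the helper name `cycle_length` and the alternative multiplier a = 5 are assumptions added for comparison.

```python
def cycle_length(a, c, m, seed):
    """Length of the cycle that the LCG x -> (a*x + c) % m eventually enters."""
    first_seen = {}
    x, step = seed, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    # x was first produced at first_seen[x] and again at the current step.
    return step - first_seen[x]


print(cycle_length(3, 9, 16, 4394))  # 8
print(cycle_length(5, 9, 16, 4394))  # 16
```

With a = 3 the generator visits only half of the 16 possible values. The multiplier a = 5 reaches the full period because a − 1 is then divisible by 4, one of the conditions of the Hull-Dobell theorem for a modulus that is a multiple of 4.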

Slide 34

Slide 34 text

Towards a better pseudorandom generator

Slide 35

Slide 35 text

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” (John von Neumann)

Slide 36

Slide 36 text

Mid square method generally
1. Start with a 4 digit seed
2. Square this value
3. If the result has fewer than 8 digits, add leading 0s
4. Take the middle 4 digits of the result
5. Repeat the sequence
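The steps above can be sketched as a single function. The name `mid_square_next` is an assumption, and 9834 is just a sample 4 digit seed.

```python
def mid_square_next(value):
    """One mid square step: square, pad to 8 digits, keep the middle 4."""
    squared = str(value * value).zfill(8)  # add leading 0s if needed
    return int(squared[2:6])               # middle 4 of the 8 digits


seed = 9834
first = mid_square_next(seed)    # 9834 squared is 96707556
second = mid_square_next(first)  # 7075 squared is 50055625
print(first, second)             # 7075 556
```

Note that the second value has only three digits: the middle digits were "0556", and the leading zero is lost when converting back to an integer.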

Slide 37

Slide 37 text

Mid square method generally
1. Start with a 4 digit seed: 9834

Slide 38

Slide 38 text

Mid square method generally
1. Start with a 4 digit seed: 9834
2. Square this value: 96707556

Slide 39

Slide 39 text

Mid square method generally
1. Start with a 4 digit seed: 9834
2. Square this value: 96707556
3. If the result has fewer than 8 digits, add leading 0s: 96707556

Slide 40

Slide 40 text

Mid square method generally
1. Start with a 4 digit seed: 9834
2. Square this value: 96707556
3. If the result has fewer than 8 digits, add leading 0s: 96707556
4. Take the middle 4 digits of the result: 7075

Slide 41

Slide 41 text

Mid square method generally
1. Start with a 4 digit seed: 9834
2. Square this value: 96707556
3. If the result has fewer than 8 digits, add leading 0s: 96707556
4. Take the middle 4 digits of the result: 7075
5. Repeat the sequence: 7075 squared is 50055625

Slide 42

Slide 42 text

Mid square method
seed_number = int(input("Please enter a four digit number:\n[####] "))
number = seed_number
already_seen = set()
counter = 0

# Keep generating until a value shows up for the second time
while number not in already_seen:
    counter += 1
    already_seen.add(number)
    # Square, pad to 8 digits, keep the middle 4
    number = int(str(number * number).zfill(8)[2:6])
    print(f"#{counter}: {number}")

print(f"We began with the seed {seed_number}, and"
      f" we repeated ourselves after {counter} steps"
      f" with {number}.")

Slide 43

Slide 43 text

Mid square method
Please enter a four digit number:
[####] 5859
#1: 3278
#2: 7452
#3: 5323
#4: 3343
#5: 1756
#6: 835
#7: 6972
#8: 6087
#9: 515
#10: 2652
.......
#59: 24
#60: 5
#61: 0
#62: 0
We began with the seed 5859, and we repeated ourselves after 62 steps with 0.

Slide 44

Slide 44 text

Issues with mid square method
● Relatively slow
● Statistically unsatisfactory
● Sample of random numbers may be too short
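To make the "too short" issue concrete, here is a minimal sketch that measures how many distinct values each 4 digit seed produces before the mid square method repeats itself. The helper names `mid_square_next` and `steps_until_repeat` are assumptions for this example.

```python
def mid_square_next(value):
    """One mid square step: square, pad to 8 digits, keep the middle 4."""
    return int(str(value * value).zfill(8)[2:6])


def steps_until_repeat(seed):
    """Count the distinct values produced before any value repeats."""
    seen = set()
    value = seed
    while value not in seen:
        seen.add(value)
        value = mid_square_next(value)
    return len(seen)


lengths = [steps_until_repeat(seed) for seed in range(10_000)]
print("longest run:", max(lengths))
print("average run:", sum(lengths) / len(lengths))

# Seeds that map to themselves produce the same number forever.
fixed_points = [v for v in range(10_000) if mid_square_next(v) == v]
print("fixed points:", fixed_points)
```

Fixed points such as 3792 (3792 squared is 14379264) are the worst case: the sequence never leaves the seed.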

Slide 45

Slide 45 text

Predicting the mid square method
[Plots: Advanced LCG | Mid square method]

Slide 46

Slide 46 text

Let’s talk cryptography

Slide 47

Slide 47 text

The Mersenne Twister
● Most used pseudo random number generator
● Very long period (the Mersenne prime: $2^{19937} - 1$)
● Not cryptographically secure
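To show what "not cryptographically secure" means in practice, here is a minimal sketch that clones Python's Mersenne Twister after observing 624 consecutive 32-bit outputs, by inverting the output tempering step of MT19937. The helper names are assumptions; the state layout passed to `setstate` (version 3, 624 words plus an index) is an implementation detail of CPython's `random.Random`.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF


def untemper(output):
    """Recover one internal state word from one 32-bit MT19937 output."""
    y = undo_right_shift_xor(output, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    return undo_right_shift_xor(y, 11)


victim = random.Random()  # seeded by the OS; the attacker never sees the seed
observed = [victim.getrandbits(32) for _ in range(624)]

# Rebuild the 624-word state and load it into a fresh generator.
state_words = tuple(untemper(value) for value in observed)
clone = random.Random()
clone.setstate((3, state_words + (624,), None))

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

The long period does not help here: every output leaks 32 bits of internal state, so the generator becomes fully predictable once enough outputs have been seen.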

Slide 48

Slide 48 text

Predicting the random() module
from datetime import datetime
from random import random

import matplotlib.pyplot as plt

def uni(n, m, a, c, seed):
    # Generate n values from an LCG with modulus m, multiplier a and
    # increment c, scaled into the range [0, 1]
    sequence = []
    Xn = seed
    for i in range(n):
        Xn = (a * Xn + c) % m
        sequence.append(Xn / float(m - 1))
    return sequence

x = range(1000)
# LCG seeded with the current microsecond
y_1 = uni(1000, 2**32, 11695477, 1, datetime.now().microsecond)
# Built-in generator
y_2 = [random() for i in range(1000)]
plt.plot(x, y_1, "o", color="blue")
plt.show()
plt.plot(x, y_2, "o", color="red")
plt.show()

Slide 49

Slide 49 text

Predicting the random() module
[Plots: Advanced LCG | Built in Random PRNG]

Slide 50

Slide 50 text

What's wrong with the random module?

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

Problems with the random module...

Slide 53

Slide 53 text

Problems with the random module...

Slide 54

Slide 54 text

Problems with the random module... ...

Slide 55

Slide 55 text

Introducing...the secrets module!

Slide 56

Slide 56 text

The secrets module
● Is cryptographically secure
● Includes ready made “batteries” for users that don’t want to build their own
● Uses 32 bytes of entropy by default
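A short sketch of those "batteries" in use. All functions shown are part of the standard library `secrets` module; the variable names and the colour list are illustrative. The 32-byte default is the current value and the Python documentation notes it may change.

```python
import secrets

raw = secrets.token_bytes()          # default size: 32 random bytes
hex_token = secrets.token_hex()      # same entropy, as 64 hex digits
url_token = secrets.token_urlsafe()  # same entropy, URL-safe Base64 text

pick = secrets.choice(["red", "green", "blue"])  # secure random choice
roll = secrets.randbelow(6) + 1                  # secure die roll, 1 to 6

print(len(raw), len(hex_token), len(url_token))
print(pick, roll)
```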

Slide 57

Slide 57 text

A note on entropy...

Slide 58

Slide 58 text

Natural sources of entropy

Slide 59

Slide 59 text

Source code of the secrets module (excerpt)
import base64
import binascii
import os
from random import SystemRandom

_sysrand = SystemRandom()

randbits = _sysrand.getrandbits
choice = _sysrand.choice

def randbelow(exclusive_upper_bound):
    return _sysrand._randbelow(exclusive_upper_bound)

DEFAULT_ENTROPY = 32  # number of bytes to return by default

def token_bytes(nbytes=None):
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    return os.urandom(nbytes)

def token_hex(nbytes=None):
    return binascii.hexlify(token_bytes(nbytes)).decode('ascii')

def token_urlsafe(nbytes=None):
    tok = token_bytes(nbytes)
    return base64.urlsafe_b64encode(tok).rstrip(b'=').decode('ascii')

Slide 60

Slide 60 text

SystemRandom
● Uses OS as a source of randomness
● Not available on all systems
● Does not rely on software states
● Sequences are not repeatable
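A minimal sketch contrasting `random.SystemRandom` with the default software generator, illustrating the last two points above. The seed value 42 and the variable names are arbitrary choices for this example.

```python
import random

# SystemRandom accepts a seed argument but ignores it.
sys_rng_a = random.SystemRandom(42)
sys_rng_b = random.SystemRandom(42)
print(sys_rng_a.getrandbits(128) == sys_rng_b.getrandbits(128))  # False in practice

# The default (Mersenne Twister) generator is fully determined by its seed.
soft_rng_a = random.Random(42)
soft_rng_b = random.Random(42)
print(soft_rng_a.getrandbits(128) == soft_rng_b.getrandbits(128))  # True

# There is no software state to save or restore.
try:
    sys_rng_a.getstate()
except NotImplementedError:
    print("SystemRandom has no state to save")
```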

Slide 61

Slide 61 text

/dev/random
● Will block without sufficient entropy
● Relies on “the kernel entropy pool”
● Slower than /dev/urandom

Slide 62

Slide 62 text

/dev/urandom
● Will not block without sufficient entropy
● Relies on “the kernel entropy pool”
● Faster than /dev/random
● Theoretically vulnerable to attack
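From Python, the kernel's generator is usually reached through `os.urandom` rather than by opening the device file directly. A minimal sketch, assuming a platform where `os.urandom` is available; which OS facility backs it depends on the platform.

```python
import os

raw = os.urandom(16)                 # 16 bytes from the OS generator
as_int = int.from_bytes(raw, "big")  # interpret them as one large integer

print(raw.hex())
print(as_int.bit_length() <= 128)
```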

Slide 63

Slide 63 text

Using the secrets module to get tokens
import secrets

token1 = secrets.token_hex(16)
token2 = secrets.token_hex(10)
print(token1)
print(token2)

Output:
d2bdc979d5ecec0dccf67854459c5284
584d93ac921d3c74be9c

Slide 64

Slide 64 text

Using the secrets module for password generation
import secrets
import string

alphabet = string.ascii_letters + string.digits
password = ''.join(secrets.choice(alphabet) for i in range(10))
print(password)

Output:
i3OFMKPr8q
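A possible extension of the example above, sketched under assumed requirements: keep drawing candidates until the password contains at least one lowercase letter, one uppercase letter, and three digits. The function name `generate_password` and the specific rules are assumptions for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=10, min_digits=3):
    """Draw candidates until one satisfies the (assumed) composition rules."""
    while True:
        candidate = ''.join(secrets.choice(ALPHABET) for _ in range(length))
        if (any(ch.islower() for ch in candidate)
                and any(ch.isupper() for ch in candidate)
                and sum(ch.isdigit() for ch in candidate) >= min_digits):
            return candidate


password = generate_password()
print(password)
```

Rejecting and redrawing whole candidates keeps every accepted password equally likely, which patching individual characters after the fact would not.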

Slide 65

Slide 65 text

The secrets module: not the be-all and end-all.

Slide 66

Slide 66 text

Python’s “nuclear reactor” of Randomness
"...folks really are better off learning to use things like cryptography.io for security sensitive software, so this change is just about harm mitigation given that it's inevitable that a non-trivial proportion of the millions of current and future Python developers won't do that."

Slide 67

Slide 67 text

Cryptography.io

Slide 68

Slide 68 text

Let’s wrap up...

Slide 69

Slide 69 text

Randomness...
● Is very important for security
● Difficult to truly achieve
● Can be simulated

Slide 70

Slide 70 text

Thank you! @amandasopkin

Slide 71

Slide 71 text

Sources:
● Icons taken from flaticon.com
● https://crypto.stackexchange.com/questions/51232/using-32-hexadecimal-digits-vs-ascii-equivalent-16-character-password
● https://dev.to/walker/pseudo-random-numbers-in-python-from-arithmetic-to-probability-distributions
● Wired Magazine
● The Washington Post
● NYT
● Dilbert