Randomness in Python:
Controlled Chaos in an Ordered
Machine
by @amandasopkin
Slide 2
Slide 2 text
Slide 3
Slide 3 text
Randomness
Makes processes secure
Mathematically/computationally, biologically, and philosophically important
Difficult to actually achieve
Slide 4
Slide 4 text
Why do we need
randomness?
Slide 5
Slide 5 text
4oio342ip4o24p32o
Slide 6
Slide 6 text
4fdslf95454
Slide 7
Slide 7 text
No content
Slide 8
Slide 8 text
No content
Slide 9
Slide 9 text
Problems with randomness
The seed, or starting point
The algorithm
Slide 10
Slide 10 text
No content
Slide 11
Slide 11 text
1. Determined that user ids were seeded
with restart time
2. Crashed the Hacker News site
3. Predicted restart time
4. Predicted assigned user ids as users
logged in
5. Impersonated discovered users
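The attack hinges on the seed being guessable. A minimal sketch of the idea, assuming a hypothetical service that seeds Python's `random.Random` with its restart time in whole seconds and hands out 32-bit ids. The names `issue_ids` and `recover_seed` and every value here are illustrative, not the actual Hacker News code:

```python
import random


def issue_ids(seed_time, count):
    """Hypothetical server: ids come from a PRNG seeded with the restart time."""
    rng = random.Random(seed_time)
    return [rng.getrandbits(32) for _ in range(count)]


def recover_seed(observed_ids, candidate_times):
    """Try each plausible restart time until the observed ids are reproduced."""
    for candidate in candidate_times:
        if issue_ids(candidate, len(observed_ids)) == observed_ids:
            return candidate
    return None


restart_time = 1_200_000_000            # illustrative Unix timestamp
server_ids = issue_ids(restart_time, 5)

# The attacker has seen two ids and knows the restart time to within a minute.
window = range(restart_time - 60, restart_time + 61)
guess = recover_seed(server_ids[:2], window)
predicted_ids = issue_ids(guess, 5)     # includes ids not yet handed out
```

Once the seed is recovered, every later id follows from it, which is why a small search window is all the attacker needs.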
● 08/2007: Shumow and Ferguson present
Dual_EC_DRBG flaw at cryptography conference
DUAL_EC_DRBG Controversy
Slide 14
Slide 14 text
● 11/2007: Schneier bases article in Wired on
their findings
DUAL_EC_DRBG Controversy
Slide 15
Slide 15 text
“...would allow NSA to determine the
state of the random number
generator, and thereby eventually be
able to read all data sent over the
SSL connection.”
DUAL_EC_DRBG Controversy
Slide 16
Slide 16 text
No content
Slide 17
Slide 17 text
● 09/2013: One of the purposes of Bullrun is
described as being "to covertly introduce
weaknesses into the encryption standards
followed by hardware and software developers
around the world."
DUAL_EC_DRBG Controversy
Slide 18
Slide 18 text
● NIST recommends removal of the algorithm as a
standard
DUAL_EC_DRBG Controversy
Slide 19
Slide 19 text
● 2004: Dual EC PRNG introduced
● 08/2007: Shumow and Ferguson present Dual_EC_DRBG
flaw at cryptography conference
● 11/2007: Schneier bases article in Wired on their
findings
DUAL_EC_DRBG Controversy
Slide 20
Slide 20 text
● 09/2013: One of the purposes of Bullrun is
described as being "to covertly introduce
weaknesses into the encryption standards followed
by hardware and software developers around the
world."
● 12/2013: Presidential advisory examines encryption
standards
● 2014: Standard is removed
DUAL_EC_DRBG Controversy
Slide 21
Slide 21 text
Years until standard removed...
10!
Slide 22
Slide 22 text
Who did this impact?
Microsoft, Google, Apple, McAfee,
Docker, IBM, Oracle, Cisco, VMware,
Juniper, HP, Red Hat, Samsung,
Toshiba, Dell, Ruckus, F5 Networks,
Lenovo, Nokia, the RSA BSAFE
libraries for Java and C++ and
more...
Slide 23
Slide 23 text
Ok, so you want to
create randomness...
Slide 24
Slide 24 text
An ideal pseudo random number generator
should...
Slide 25
Slide 25 text
1. Pass statistical tests of randomness
An ideal pseudo random number generator
should...
Monobit
Distance
Poker or Craps
Birthday
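As one concrete instance, here is a minimal sketch of a monobit (frequency) test, which asks whether ones and zeros are roughly balanced. It follows the usual formulation of the test as given in NIST SP 800-22; the function name and the sample inputs are illustrative assumptions:

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: are ones and zeros roughly balanced?"""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum; a balanced sequence sums near 0.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


balanced = [0, 1] * 50    # 50 zeros and 50 ones, strictly alternating
lopsided = [1] * 100      # all ones

p_balanced = monobit_p_value(balanced)
p_lopsided = monobit_p_value(lopsided)
```

A p-value below a chosen threshold (0.01 is a common choice) counts as a failure. The alternating sequence passes this test despite being perfectly predictable, which is why a generator is judged against a whole battery of tests rather than one.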
Slide 26
Slide 26 text
1. Pass statistical tests of randomness
2. Take a long time before repeating
An ideal pseudo random number generator
should...
Have a long “period”
Slide 27
Slide 27 text
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently
An ideal pseudo random number generator
should...
Quick & low storage
Slide 28
Slide 28 text
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently
4. Be repeatable
An ideal pseudo random number generator
should...
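Repeatability is easy to see with the standard library: the same seed yields the same sequence. A small sketch (the helper name `draw` and the seed values are arbitrary choices for illustration):

```python
import random


def draw(seed, n=5):
    """Return n floats from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first_run = draw(2024)
second_run = draw(2024)   # same seed, so the same sequence
other_seed = draw(2025)   # different seed, so a different sequence
```

This is what makes simulations and tests reproducible, and it is also exactly the property an attacker exploits when the seed can be guessed.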
Slide 29
Slide 29 text
1. Pass statistical tests of randomness
2. Take a long time before repeating
3. Execute efficiently
4. Be repeatable
5. Be portable
An ideal pseudo random number generator
should...
Can be run on any machine or system
Slide 30
Slide 30 text
What are the common
ways of generating
“randomness”?
Slide 31
Slide 31 text
Linear congruential generators
Linear congruential generators take the form
$x_k = (a x_{k-1} + c) \bmod M$
where $x_0$ is the seed, the integer $M$ is typically chosen close to the
largest representable integer, and the period is at most $M$.
Slide 32
Slide 32 text
Linear congruential generators
a = 3
c = 9
m = 16
seed = 4394

def lcg():
    xi = seed  # start from the seed defined above
    for i in range(10):
        xi = (a*xi + c) % m  # each value depends only on the previous one
        print(xi)

lcg()
Slide 33
Slide 33 text
Linear congruential generators
Algorithm: xi = (a*xi + c)%m
7
14
3
2
15
6
11
10
7
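The output above returns to 7 after only eight values, even though the modulus is 16. A short sketch that measures the period directly; `lcg_period` is a helper written for this illustration, and the second parameter set (a = 5, c = 3) is an assumed example chosen to reach the full period:

```python
def lcg_period(a, c, m, seed):
    """Count the steps in the cycle an LCG eventually falls into."""
    first_seen = {}          # value -> step at which it first appeared
    x, step = seed, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - first_seen[x]


slide_period = lcg_period(a=3, c=9, m=16, seed=4394)   # parameters from the slide
full_period = lcg_period(a=5, c=3, m=16, seed=0)       # illustrative parameters
```

With the slide's parameters the period is 8; with a = 5 and c = 3 the generator visits all 16 residues before repeating. The choice of a and c, not just M, decides how long the sequence lasts.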
Slide 34
Slide 34 text
Towards a better
pseudorandom generator
Slide 35
Slide 35 text
“Anyone who considers arithmetical methods of producing random digits is, of
course, in a state of sin.” (John von Neumann)
Slide 36
Slide 36 text
Mid square method generally
Start with a 4 digit seed
Square this value
If the result has fewer than 8 digits, add
leading 0s
Take the middle 4 digits of the result
Repeat the sequence
Slide 37
Slide 37 text
Mid square method generally
Start with a 4 digit seed 9834
Slide 38
Slide 38 text
Mid square method generally
Start with a 4 digit seed
Square this value 96707556
9834
Slide 39
Slide 39 text
Mid square method generally
Start with a 4 digit seed
Square this value
If the result has fewer than 8
digits, add leading 0s
96707556
9834
96707556
Slide 40
Slide 40 text
Mid square method generally
Start with a 4 digit seed
Square this value
If the result has fewer than 8
digits, add leading 0s
Take the middle 4 digits of the
result
9834
96707556
96707556
7075
Slide 41
Slide 41 text
Mid square method generally
Start with a 4 digit seed
Square this value
If the result has fewer than 8
digits, add leading 0s
Take the middle 4 digits of the
result
Repeat the sequence
9834
96707556
96707556
7075
50055625
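The worked example above can be checked with a one-step helper. This is a sketch of a single iteration only; the function name is an assumption made for illustration:

```python
def mid_square_step(number):
    """Square a 4-digit value, pad to 8 digits, keep the middle 4."""
    squared = str(number * number).zfill(8)
    return int(squared[2:6])


first = mid_square_step(9834)     # 9834**2 = 96707556 -> middle digits 7075
second = mid_square_step(first)   # 7075**2 = 50055625 -> middle digits 0556
```

Note that the second result is 556 once the leading zero is dropped, which is why the padding step matters on the next squaring.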
Slide 42
Slide 42 text
Mid square method
seed_number = int(input("Please enter a four digit number:\n[####] "))
number = seed_number
already_seen = set()
counter = 0
while number not in already_seen:
    counter += 1
    already_seen.add(number)
    number = int(str(number * number).zfill(8)[2:6])
    print(f"#{counter}: {number}")
print(f"We began with the seed {seed_number}, and"
      f" we repeated ourselves after {counter} steps"
      f" with {number}.")
Slide 43
Slide 43 text
Mid square method
Please enter a four digit number:
[####] 5859
#1: 3278
#2: 7452
#3: 5323
#4: 3343
#5: 1756
#6: 835
#7: 6972
#8: 6087
#9: 515
#10: 2652
.......
#59: 24
#60: 5
#61: 0
#62: 0
We began with the seed 5859, and we repeated ourselves after 62 steps with 0.
Slide 44
Slide 44 text
Issues with mid square method
Relatively slow
Statistically unsatisfactory
Sample of random numbers may be too short
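How short a sample can get is easy to measure. A sketch that counts the values produced before the first repeat, using the same update rule as the mid square code in this deck; the helper name and the chosen seeds are illustrative:

```python
def steps_until_repeat(seed):
    """Number of distinct values seen before the sequence repeats itself."""
    seen = set()
    number = seed
    while number not in seen:
        seen.add(number)
        number = int(str(number * number).zfill(8)[2:6])
    return len(seen)


# Some seeds collapse almost immediately: 2500 and 100 map to themselves.
short_runs = {seed: steps_until_repeat(seed) for seed in (0, 5, 100, 2500)}

# Longest run over every possible seed of up to four digits.
longest = max(steps_until_repeat(seed) for seed in range(10_000))
```

Because every value fits in four digits, no run can exceed 10,000 values, and fixed points such as 2500 make some runs end after a single step.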
Slide 45
Slide 45 text
Predicting the mid square method
[Plots, side by side: Advanced LCG | Mid square method]
Slide 46
Slide 46 text
Let’s talk cryptography
Slide 47
Slide 47 text
Most used pseudo random number generator
Very long period (the Mersenne prime: $2^{19937} - 1$)
Not cryptographically secure
The Mersenne Twister
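"Not cryptographically secure" means, in practice, that the internal state fully determines every future output. The sketch below assumes the attacker has somehow obtained that state and simply copies it with `getstate()` and `setstate()`. A real attack would have to reconstruct the state from observed outputs, which is known to be possible for the Mersenne Twister but is not shown here:

```python
import random

victim = random.Random()              # seeded from OS entropy
leaked_state = victim.getstate()      # assumption: the attacker learns this

clone = random.Random()
clone.setstate(leaked_state)          # attacker's copy of the generator

victim_outputs = [victim.getrandbits(32) for _ in range(5)]
predicted_outputs = [clone.getrandbits(32) for _ in range(5)]
```

The clone predicts every value the original will produce, so nothing derived from this generator should be treated as a secret.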
Slide 48
Slide 48 text
Predicting the random() module
from datetime import datetime
from random import random
import matplotlib.pyplot as plt

def uni(n, m, a, c, seed):
    # Generate n values scaled to [0, 1] from an LCG with modulus m
    sequence = []
    Xn = seed
    for i in range(n):
        Xn = (a*Xn + c) % m
        sequence.append(Xn/float(m-1))
    return sequence

x = range(1000)
# Seed the LCG with the current microsecond
y_1 = uni(1000, 2**32, 11695477, 1, datetime.now().microsecond)
y_2 = [random() for i in range(1000)]
plt.plot(x, y_1, "o", color="blue")
plt.show()
plt.plot(x, y_2, "o", color="red")
plt.show()
Slide 49
Slide 49 text
Predicting the random() module
[Plots, side by side: Advanced LCG | Built in Random PRNG]
Slide 50
Slide 50 text
What's wrong with the
random module?
Slide 51
Slide 51 text
No content
Slide 52
Slide 52 text
Problems with the random module...
Slide 53
Slide 53 text
Problems with the random module...
Slide 54
Slide 54 text
Problems with the random module...
...
Slide 55
Slide 55 text
Introducing...the
secrets module!
Slide 56
Slide 56 text
The Secrets module
Is cryptographically secure
Includes ready-made “batteries” for users who don’t want to build their own
Uses 32 bytes of entropy by default
Slide 57
Slide 57 text
A note on entropy...
Slide 58
Slide 58 text
Natural sources of entropy
Slide 59
Slide 59 text
Source code of Secrets module
import base64
import binascii
import os
from random import SystemRandom

_sysrand = SystemRandom()
randbits = _sysrand.getrandbits
choice = _sysrand.choice

def randbelow(exclusive_upper_bound):
    return _sysrand._randbelow(exclusive_upper_bound)

DEFAULT_ENTROPY = 32  # number of bytes to return by default

def token_bytes(nbytes=None):
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    return os.urandom(nbytes)

def token_hex(nbytes=None):
    return binascii.hexlify(token_bytes(nbytes)).decode('ascii')

def token_urlsafe(nbytes=None):
    tok = token_bytes(nbytes)
    return base64.urlsafe_b64encode(tok).rstrip(b'=').decode('ascii')
Slide 60
Slide 60 text
SystemRandom
Uses OS as a source of randomness
Not available on all systems
Does not rely on software states
Sequences are not repeatable
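The last point can be seen by handing the same seed to both generator classes. A small sketch (the variable names and the seed value 42 are arbitrary); `SystemRandom` accepts a seed argument but ignores it:

```python
import random

# Seeded Mersenne Twister: the same seed gives the same value.
seeded_a = random.Random(42).getrandbits(128)
seeded_b = random.Random(42).getrandbits(128)

# SystemRandom draws from the OS; the seed argument has no effect.
os_a = random.SystemRandom(42).getrandbits(128)
os_b = random.SystemRandom(42).getrandbits(128)
```

The two seeded values match, while the two OS-backed values differ (barring a vanishingly unlikely 128-bit collision).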
Slide 61
Slide 61 text
/dev/random
Will block without sufficient entropy
Relies on “the kernel entropy pool”
Slower than /dev/urandom
Slide 62
Slide 62 text
/dev/urandom
Will not block, even without sufficient entropy
Relies on “the kernel entropy pool”
Faster than /dev/random
Theoretically vulnerable to attack
Slide 63
Slide 63 text
Using the secrets module to get tokens
import secrets
token1 = secrets.token_hex(16)
token2 = secrets.token_hex(10)
print(token1)
print(token2)
d2bdc979d5ecec0dccf67854459c5284
584d93ac921d3c74be9c
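The argument to `token_hex` is a number of bytes, so the printed token has twice as many characters. A short sketch that also shows `token_urlsafe` and a constant-time comparison with `secrets.compare_digest`; the `token_matches` helper is a hypothetical wrapper added for illustration:

```python
import secrets

hex_token = secrets.token_hex(16)        # 16 random bytes -> 32 hex characters
url_token = secrets.token_urlsafe(16)    # same entropy as URL-safe Base64 text


def token_matches(supplied, expected):
    """Compare tokens in constant time to avoid leaking where they differ."""
    return secrets.compare_digest(supplied, expected)
```

Comparing tokens with `==` can leak timing information, so `compare_digest` is the safer default when checking a token supplied by a user.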
Slide 64
Slide 64 text
Using the secrets module for password
generation
import secrets
import string
alphabet = string.ascii_letters + string.digits
password = ''.join(secrets.choice(alphabet)
                   for i in range(10))
print(password)
i3OFMKPr8q
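A purely random draw can, by chance, contain no digits or no uppercase letters. If a password policy requires each character class, one option is to redraw until the candidate qualifies. This sketch is modeled on the recipes in the `secrets` documentation; the function name and the specific policy are assumptions:

```python
import secrets
import string

alphabet = string.ascii_letters + string.digits


def generate_password(length=10):
    """Redraw until the password has a lowercase, an uppercase, and a digit."""
    while True:
        candidate = ''.join(secrets.choice(alphabet) for _ in range(length))
        if (any(ch.islower() for ch in candidate)
                and any(ch.isupper() for ch in candidate)
                and any(ch.isdigit() for ch in candidate)):
            return candidate


password = generate_password()
```

Rejecting and redrawing keeps every accepted password equally likely, which patching characters into a failed candidate would not.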
Slide 65
Slide 65 text
The secrets module:
not the end all be all.
Slide 66
Slide 66 text
Python’s “nuclear reactor” of
Randomness
"...folks really are better off learning to use things
like cryptography.io for security sensitive software, so
this change is just about harm mitigation given that it's
inevitable that a non-trivial proportion of the millions
of current and future Python developers won't do that."
Slide 67
Slide 67 text
Cryptography.io
Slide 68
Slide 68 text
Let’s wrap up...
Slide 69
Slide 69 text
Is very important for security
Difficult to truly achieve
Can be simulated
Randomness...
Slide 70
Slide 70 text
Thank you!
Slide 71
Slide 71 text
Sources:
● Icons taken from flaticon.com
● https://crypto.stackexchange.com/questions/51232/using-32-hexadecimal-digits-vs-ascii-equivalent-16-character-password
● https://dev.to/walker/pseudo-random-numbers-in-python-from-arithmetic-to-probability-distributions
● Wired Magazine
● The Washington Post
● NYT
● Dilbert