Engineering Prime Numbers

Engineering Primes Taking the magic out of magic numbers George
Tankersley F2’17 @gtank__

Elliptic Curve Cryptography Crash Course What’s an elliptic curve? A
group of points (x, y) satisfying an equation y^2 = x^3 + ax + b over the finite field of integers modulo a prime.

Elliptic Curve Cryptography Crash Course
[Figures: curves over the real numbers; a curve over a finite field (p = 61)]

Elliptic Curve Cryptography Crash Course You can do several things
with curve points:
Addition: P + P = 2P
Multiplication: 5P = P + P + P + P + P
Negation: P + (-P) = O (O is the ✨point at infinity✨)
Adding points involves many multiplications in the underlying field.
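The three point operations can be sketched on a toy curve. This is only an illustration: the coefficients a = 2, b = 3 and the helper names (on_curve, add, neg, mul) are assumptions made up for the example, and p = 61 is far too small for real use. Note how every addition needs a handful of field multiplications plus a modular inverse.

```python
# Toy curve y^2 = x^3 + A*x + B over the integers mod P_MOD.
# A, B and all function names are hypothetical, chosen for illustration.
P_MOD = 61
A, B = 2, 3

def on_curve(pt):
    """None stands for O, the point at infinity."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0

def neg(pt):
    """Negation: flip the sign of y."""
    return None if pt is None else (pt[0], -pt[1] % P_MOD)

def add(p, q):
    """Addition of two points (affine chord-and-tangent formulas)."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                      # P + (-P) = O
    if p == q:                           # doubling: tangent slope
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:                                # distinct points: chord slope
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    s %= P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def mul(k, pt):
    """Multiplication k*P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

# Pick any point with y != 0 by brute force (fine for a 61-element field).
G = next((x, y) for x in range(P_MOD) for y in range(1, P_MOD)
         if on_curve((x, y)))
```

The modular inverse uses three-argument pow with a negative exponent, which needs Python 3.8 or later.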

Why does this matter?? Elliptic curves let us use much
smaller fields for the same security.

Comparative field sizes (in bits) for a given security level

Security Level   RSA     Traditional DH   ECC
128              3072    3072             256
192              7680    7680             384
256              15360   15360            512

Really Big Numbers 256 bits of what? REALLY BIG NUMBERS.
Really big. You just won't believe how vastly, hugely, mind-bogglingly big they are. They are much bigger than your machine can natively represent.

Really Big Numbers To represent them, we choose a radix
(or base) and decompose into multiple limbs. 256 bits = 8 x 32
N = a0 + (a1 * 2^32) + (a2 * 2^64) + … + (a7 * 2^224)
  = a0 + (a1 << 32) + (a2 << 64) + … + (a7 << 224)
[ a7 | a6 | a5 | a4 | a3 | a2 | a1 | a0 ] (each limb is a uint32)

Really Big Numbers To represent them, we choose a radix
(or base) and decompose into multiple limbs. 256 bits = 4 x 64
N = a0 + (a1 * 2^64) + (a2 * 2^128) + (a3 * 2^192)
  = a0 + (a1 << 64) + (a2 << 128) + (a3 << 192)
[ a3 | a2 | a1 | a0 ] (each limb is a uint64)
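A minimal sketch of the decomposition, using Python integers to stand in for the big number. The function names to_limbs and from_limbs are made up for this example; limbs are stored least significant first, so limbs[0] is a0.

```python
def to_limbs(n, limb_bits, count):
    """Split n into `count` limbs of `limb_bits` bits each, a0 first."""
    mask = (1 << limb_bits) - 1
    return [(n >> (limb_bits * i)) & mask for i in range(count)]

def from_limbs(limbs, limb_bits):
    """Recombine: N = a0 + (a1 << limb_bits) + (a2 << 2*limb_bits) + ..."""
    return sum(a << (limb_bits * i) for i, a in enumerate(limbs))

n = 2**255 - 19
limbs32 = to_limbs(n, 32, 8)   # 256 bits = 8 x 32
limbs64 = to_limbs(n, 64, 4)   # 256 bits = 4 x 64
```

Both layouts describe the same number; only the radix (2^32 or 2^64) changes.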

Really Big Numbers This is called multi-precision (or bignum) arithmetic.
Think of elementary multiplication. It’s the same thing!

      2 5        a1 a0
    x   5           b0
    -----
    1 2 5     r2 r1 r0

a0 * b0 = 5 * 5 = 25, so r0 = 5 and the carry is c0 = 2
a1 * b0 + c0 = 2 * 5 + 2 = 12, so r1 = 2 and the carry is c1 = 1
a2 * b0 + c1 = 0 * 5 + 1 = 1, so r2 = 1
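The same digit-by-digit procedure, written as a loop over limbs. This is a sketch of plain schoolbook multiplication, not any particular library's routine; mul_limbs is a made-up name, and limbs are stored least significant first. With radix 10 it reproduces the 25 x 5 example; with radix 2^32 or 2^64 it is bignum multiplication.

```python
def mul_limbs(a, b, radix):
    """Schoolbook multiplication of two little-endian limb lists."""
    r = [0] * (len(a) + len(b))
    for j, bj in enumerate(b):
        carry = 0
        for i, ai in enumerate(a):
            t = r[i + j] + ai * bj + carry   # partial product plus carry in
            r[i + j] = t % radix             # low part stays in this limb
            carry = t // radix               # high part moves to the next limb
        r[j + len(a)] = carry
    return r

digits = mul_limbs([5, 2], [5], 10)   # 25 x 5, result is [5, 2, 1], i.e. 125
```

The inner loop runs len(a) * len(b) times, which is why fewer limbs means fewer operations.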

Why does this matter?? Elliptic curves let us do the
same things, faster. MUCH FASTER. Smaller underlying field size => Fewer limbs => Fewer operations => ZOOM ZOOM As a bonus, smaller representations use less bandwidth!

The underlying field Recall that “field” just means “integers modulo
a prime” for all we care.
Z/3Z = { 0, 1, 2 }
1 + 2 = 0 mod 3
1 + 0 = 1 mod 3
1 + (-1) = 0 mod 3
5 = 2 mod 3
Field size is how big the prime is / how many elements, and correlates to security. The shape of the field’s prime matters for performance.

Mersenne Primes (2^k - 1) Given a number in base
2, it’s fast to reduce it by a number close to a power of 2. Computers use base 2! Mersenne primes are very close to a power of two!
Let n = 7 = 2^3 - 1, then we see that 2^3 ≡ 1 (mod n)
To reduce x = 18 mod 7, first convert x to base 2^3 by grouping into 3-bit words:
x = (010010)b
x’ = (010)b * 2^3 + (010)b = 2 * 8 + 2 (mod 7)
x’ = (010)b * 1 + (010)b = 2 * 1 + 2 (mod 7)
x’ = (010)b + (010)b = 2 + 2 = 4 (mod 7)
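The worked example generalizes to a shift, a mask and an add. The sketch below assumes nothing beyond the identity 2^k ≡ 1 (mod 2^k - 1); reduce_mersenne is a name invented for the example.

```python
def reduce_mersenne(x, k):
    """Reduce x modulo 2^k - 1 using only shifts, masks and additions."""
    p = (1 << k) - 1
    while x > p:
        # x = hi * 2^k + lo, and 2^k ≡ 1, so x ≡ hi + lo
        x = (x >> k) + (x & p)
    return 0 if x == p else x   # p itself is congruent to 0

r = reduce_mersenne(18, 3)      # 18 mod 7 = 4
```

No division is needed, which is the whole appeal of primes of this shape.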

Mersenne Primes (2^k - 1) Mersenne primes are very rare
:( In the 32-bit range, there are 8 of them. None at all between 2^127 - 1 and 2^521 - 1. Also, composite k will never produce a prime, so limb alignment is always going to be sub-optimal. Lack of choice makes this worse. A little more flexibility would be nice.
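The rarity claim is easy to check with the Lucas-Lehmer test, which decides whether 2^k - 1 is prime for a prime exponent k. The sketch below is a swapped-in illustration rather than anything from the talk, and the function names are made up.

```python
def is_small_prime(k):
    """Trial division; fine for exponents in the hundreds."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def is_mersenne_prime(k):
    """Lucas-Lehmer test for 2^k - 1."""
    if not is_small_prime(k):
        return False            # composite k never gives a prime
    if k == 2:
        return True             # 2^2 - 1 = 3, handled separately
    m = (1 << k) - 1
    s = 4
    for _ in range(k - 2):
        s = (s * s - 2) % m
    return s == 0

exponents = [k for k in range(2, 522) if is_mersenne_prime(k)]
in_32_bit_range = [k for k in exponents if k <= 32]
gap = [k for k in exponents if 127 < k < 521]
```

Here in_32_bit_range has 8 entries and gap is empty, matching the slide.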

Crandall Primes (2^k - c) Same fast-reduction identity applies. Curve25519
uses p = 2^255 - 19, so we have 2^255 ≡ 19 mod p
To reduce x in the range p < x < p^2, we can split it into 255-bit “high” and “low” halves:
x = a * 2^255 + b (mod 2^255 - 19)
  = a * 19 + b (mod 2^255 - 19)
This multiplication risks overflowing and requiring its own reduction step.
Generally, a * 2^k + b ≡ a * c + b (mod 2^k - c)
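A minimal sketch of the fold for p = 2^255 - 19, using Python integers in place of limbs. The names K, C, P, MASK and reduce_crandall are made up for the example. The loop repeats the fold because a * 19 + b can itself spill past 255 bits, which is the extra reduction step mentioned above.

```python
K, C = 255, 19
P = (1 << K) - C
MASK = (1 << K) - 1

def reduce_crandall(x):
    """Reduce x modulo 2^255 - 19 via a * 2^255 + b ≡ a * 19 + b."""
    while x >> K:                       # anything above bit 254?
        x = (x >> K) * C + (x & MASK)   # fold the high half back in
    if x >= P:                          # now x < 2^255, so one subtraction is enough
        x -= P
    return x
```

Real implementations do this per limb and in constant time; this version only shows the arithmetic identity.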

Crandall Primes (2^k - c) Crandall primes are not rare!
They also don’t have to have prime k, and thus give us a lot more flexibility in choosing a well-aligned limb schedule. The most serious constraint is the need for a small c.

Crandall Primes (2^k - c) I chose my prime 2^255
− 19 according to the following criteria: primes as close as possible to a power of 2 save time in field operations (as in, e.g, [9]), with no effect on (conjectured) security level; primes slightly below 32k bits, for some k, allow public keys to be easily transmitted in 32-bit words, with no serious concerns regarding wasted space; k = 8 provides a comfortable security level. I considered the primes 2^255 + 95, 2^255 − 19, 2^255 − 31, 2^254 + 79, 2^253 + 51, and 2^253 + 39, and selected 2^255 − 19 because 19 is smaller than 31, 39, 51, 79, 95. (Bernstein, “Curve25519: new Diffie-Hellman speed records”)

Limb Schedules The divisibility of the bitsize matters. 256 bits
= 4 x 64 This is a uniform, saturated representation. Very tidy. In practice, though...
[ a3 | a2 | a1 | a0 ] (each limb is a uint64)

Limb Schedules This choice is absurdly platform-specific: Why split 255-bit
integers into ten 26-bit pieces, rather than nine 29-bit pieces or eight 32-bit pieces? Answer: The coefficients of a polynomial product do not fit into the Pentium M’s fp registers if pieces are too large. The cost of handling larger coefficients outweighs the savings of handling fewer coefficients. The overall time for 29-bit pieces is sufficiently competitive to warrant further investigation, but so far I haven’t been able to save time this way. I’m sure that 32-bit pieces, the most common choice in the literature, are a bad idea. Of course, the same question must be revisited for each CPU. (Bernstein)

Limb Schedules The divisibility of the bitsize matters. 255 bits
= 5 x 51 Uniform, unsaturated. Headspace allows lazy reduction.
[ _ a3 | _ a2 | _ a1 | _ a0 ] (each uint64 holds a 51-bit limb; _ marks the unused high bits)
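What lazy reduction buys can be sketched like this, assuming the field is the one from the earlier slides, p = 2^255 - 19. All names are made up for the example, and Python integers stand in for uint64 limbs. Because each 51-bit limb leaves 13 spare bits in its 64-bit word, several additions can pile up before any carry has to be propagated.

```python
P = 2**255 - 19            # assumption: the Curve25519 field
MASK51 = (1 << 51) - 1

def to_limbs51(n):
    """Five 51-bit limbs, least significant first."""
    return [(n >> (51 * i)) & MASK51 for i in range(5)]

def from_limbs51(limbs):
    return sum(a << (51 * i) for i, a in enumerate(limbs))

def add_lazy(a, b):
    """Limb-wise add with no carries; limbs may grow past 51 bits."""
    return [x + y for x, y in zip(a, b)]

def carry(limbs):
    """Propagate carries once; the top carry wraps around times 19."""
    limbs = list(limbs)
    for i in range(4):
        limbs[i + 1] += limbs[i] >> 51
        limbs[i] &= MASK51
    limbs[0] += 19 * (limbs[4] >> 51)   # 2^255 ≡ 19 (mod p)
    limbs[4] &= MASK51
    return limbs

values = [P - 1 - i for i in range(8)]
acc = to_limbs51(values[0])
for v in values[1:]:
    acc = add_lazy(acc, to_limbs51(v))  # seven adds, zero carries
reduced = carry(acc)
```

After carry, limb 0 can still sit slightly above 51 bits; the result is only weakly reduced, which is usually fine until the final output.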

Limb Schedules Vector instructions change everything. Strange widths, and even
more expensive carries. SIMD-friendly design is where it’s at now.

The rabbit hole 3.2 The Goldilocks prime, 2^448 − 2^224
− 1 I chose the Solinas trinomial prime p := 2^448 − 2^224 − 1. I call this the “Goldilocks” prime because its form defines the golden ratio φ ≡ 2^224. Because 224 = 32 · 7 = 28 · 8 = 56 · 4, this prime supports fast arithmetic in radix 2^28 or 2^32 (on 32-bit machines) or 2^56 (on 64-bit machines). With 16, 28-bit limbs it works well on vector units such as NEON. Furthermore, radix-2^64 implementations are possible with greater efficiency than most of the NIST primes. Mike Hamburg, “Ed448-Goldilocks, a new elliptic curve”
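The “golden ratio” remark can be checked directly: with φ = 2^224, the prime's form gives φ^2 ≡ φ + 1 (mod p), and that identity is also the reduction rule. A minimal sketch with Python integers, where PHI, MASK448 and reduce_goldilocks are names invented for the example:

```python
P = 2**448 - 2**224 - 1
PHI = 2**224
MASK448 = (1 << 448) - 1

def reduce_goldilocks(x):
    """Reduce x modulo p using 2^448 ≡ 2^224 + 1 (mod p)."""
    while x >> 448:
        hi, lo = x >> 448, x & MASK448
        x = lo + hi + (hi << 224)   # hi * 2^448 ≡ hi * (2^224 + 1)
    if x >= P:                      # x < 2^448 here, so one subtraction is enough
        x -= P
    return x
```

The fold costs only shifts and additions, with no multiplication by a constant at all.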

Questions? George Tankersley F2’17 @gtank__
