SHA-1 backdooring and exploitation

Slide 1

Slide 1 text

Slide 2

Slide 2 text

brought to you by
Maria Eichlseder, Florian Mendel, Martin Schläffer (TU Graz, .at): cryptanalysis
@angealbertini (Corkami, .de): binary kung-fu
@veorq (Kudelski Security, .ch): theory and propaganda :-)

Slide 3

Slide 3 text

1. WTF is a hash function backdoor?
2. backdooring SHA1 with cryptanalysis
3. exploitation! collisions!

Slide 4

Slide 4 text

TL;DR:

Slide 5

Slide 5 text

who’s interested in crypto backdoors?

Slide 6

Slide 6 text

& Dual_EC speculation — https://projectbullrun.org

Slide 7

Slide 7 text

Clipper (1993)

Slide 8

Slide 8 text

crypto researchers?

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Young/Yung malicious cipher (2003)
- compresses texts to leak key bits in ciphertexts
- blackbox only (internals reveal the backdoor)
- other “cryptovirology” schemes

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

2011: theoretical framework, but nothing useful

Slide 13

Slide 13 text

what’s a crypto backdoor?

Slide 14

Slide 14 text

not an implementation backdoor
example: RC4 C implementation (Wagner/Biondi)
    #define TOBYTE(x) (x) & 255
    #define SWAP(x,y) do { x^=y; y^=x; x^=y; } while (0)
    static unsigned char A[256];
    static int i=0, j=0;
    unsigned char encrypt_one_byte(unsigned char c) {
        int k;
        i = TOBYTE(i+1);
        j = TOBYTE(j + A[i]);
        SWAP(A[i], A[j]);
        k = TOBYTE(A[i] + A[j]);
        return c ^ A[k];
    }
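One property of the SWAP macro in the snippet above is worth seeing in isolation: the three-XOR swap only works when its two operands are different storage locations. The sketch below is a Python illustration, not part of the original deck; the helper name `xor_swap` and the sample values are made up for the example. It shows that swapping an element with itself zeroes it, which is what would happen to A[i] in the C code whenever i equals j.

```python
def xor_swap(state, i, j):
    """Swap state[i] and state[j] with the same three-XOR trick as SWAP."""
    state[i] ^= state[j]
    state[j] ^= state[i]
    state[i] ^= state[j]

state = [0x11, 0x22, 0x33]

xor_swap(state, 0, 1)      # distinct indices: behaves like a normal swap
swapped = list(state)

xor_swap(state, 2, 2)      # same index: x ^= x leaves zero behind
aliased = list(state)
```

Each aliased swap replaces one state byte with zero, so a cipher state updated this way can only lose entropy over time.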

Slide 15

Slide 15 text

a backdoor (covert) isn’t a trapdoor (overt)
RSA has a trapdoor, NSA has backdoors
VSH is a trapdoor hash based on RSA

Slide 16

Slide 16 text

backdoor in a crypto hash?

Slide 17

Slide 17 text

“some secret property that allows you to efficiently break the hash”

Slide 18

Slide 18 text

“break” can be about collisions, preimages…
how to model the stealthiness of the backdoor…
exploitation can be deterministic or randomized…

Slide 19

Slide 19 text

role reversal
Eve wants to achieve some security property
Alice and Bob (the users) are the adversaries

Slide 20

Slide 20 text

definitions
malicious hash = pair of algorithms: generate() and exploit()
exploit() is either “static” or “dynamic”
generate(): randomness → hash function H, backdoor b
exploit(): hash function H, backdoor b, challenge → collision/preimage

Slide 21

Slide 21 text

taxonomy
static collision backdoor: returns constant m and m’ such that H(m)=H(m’)
dynamic collision backdoor: returns random m and m’ such that H(m)=H(m’)
static preimage backdoor: returns m such that H(m) has low entropy
dynamic preimage backdoor: given h, returns m such that H(m)=h
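To make the generate()/exploit() vocabulary concrete, here is a deliberately naive toy of a dynamic collision backdoor. It is not the construction from this talk: the hidden prefix-rewriting rule, the use of SHA-256 as the underlying hash, and all names are assumptions chosen only to illustrate the interface.

```python
import hashlib
import os

def generate():
    """Return (H, b): a toy hash H and its backdoor b (two secret prefixes)."""
    s1, s2 = os.urandom(16), os.urandom(16)

    def H(message: bytes) -> str:
        # Hidden rule: a message starting with s2 is hashed as if it started with s1.
        if message.startswith(s2):
            message = s1 + message[len(s2):]
        return hashlib.sha256(message).hexdigest()

    return H, (s1, s2)

def exploit(b, payload: bytes):
    """Dynamic collision backdoor: a fresh colliding pair for each payload."""
    s1, s2 = b
    return s1 + payload, s2 + payload

H, b = generate()
m1, m2 = exploit(b, os.urandom(8))
collides = (m1 != m2) and (H(m1) == H(m2))
```

Anyone who can read this H sees the secret prefixes immediately, so the toy is both detectable and discoverable; it only shows the shape of the two algorithms.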

Slide 22

Slide 22 text

stealth definitions
undetectability vs undiscoverability
detect() may also return levels of suspicion
H may be obfuscated...
detect(): hash function H → exploit() ?
discover(): hash function H → backdoor b, exploit()

Slide 23

Slide 23 text

our results
dynamic collision backdoor
valid structured files with arbitrary payloads
detectable, but undiscoverable
and as hard to discover as to break SHA-1

Slide 24

Slide 24 text

SHA-1

Slide 25

Slide 25 text

SHA-1 everywhere
RSA-OAEP, “RSAwithSHA1”, HMAC, PBKDF2, etc.
⇒ in TLS, SSH, IPsec, etc.
integrity check: git, bootloaders, HIDS/FIM, etc.

Slide 26

Slide 26 text

SHA-1

Slide 27

Slide 27 text

but no collision published yet
actual complexity unclear (>2^60)

Slide 28

Slide 28 text

Differential cryptanalysis for collisions “perturb-and-correct”

Slide 29

Slide 29 text

2 stages (offline/online)
1. find a good differential characteristic = one of high probability
2. find conforming messages with message modification techniques

Slide 30

Slide 30 text

find a characteristic: linearization
[figure: characteristic split into low-probability and high-probability parts; probabilities shown: 2^-40, 2^-15, 2^-40]

Slide 31

Slide 31 text

find conforming messages
low-probability part: “easy”, K1 unchanged
use automated tool to find a conforming message
round 2: try all 2^32 K2’s, repeat 2^8 times (cost 2^40)
consider constant K2 as part of the message!
round 3: do the same to find a K3 (total cost 2^48), repeating the 2^40 search of K2 2^8 times...
round 4: find K4 in negligible time
iterate to minimize the differences in the constants...

Slide 32

Slide 32 text

collision! 1-block, vs. 2-block collisions for previous attacks

Slide 33

Slide 33 text

empty

Slide 34

Slide 34 text

but it’s not the real SHA-1!

Slide 35

Slide 35 text

“custom” standards are common in proprietary systems (encryption appliances, set-top boxes, etc.)
motivations:
customer-specific crypto (customers’ request)
“other reasons”

Slide 36

Slide 36 text

how to turn garbage collisions into useful collisions? (= 2 valid files with arbitrary content)

Slide 37

Slide 37 text

basic idea
M1 | Payload 1 | Payload 2
M2 | Payload 1 | Payload 2
where H(M1)=H(M2) and Mx is essentially “process payload x”

Slide 38

Slide 38 text

constraints
differences (only in) the first block
difference in the first four bytes ⇒ 4-byte signatures corrupted

Slide 39

Slide 39 text

PE? (Win* executables, etc.)
differences force EntryPoint to be at > 0x40000000 ⇒ 1 GiB (not supported by Windows)

Slide 40

Slide 40 text

PE = fail

Slide 41

Slide 41 text

ELF, Mach-O = fail (≥ 4-byte signature at offset 0)

Slide 42

Slide 42 text

shell scripts?

Slide 43

Slide 43 text

# # EOL //block 1 start //block 2 start //same payload

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

RAR/7z
scanned forward
≥ 4-byte signature :-(
but signature can start at any offset :-D
⇒ payload = 2 concatenated archives

Slide 47

Slide 47 text

killing the 1st signature byte disables the top archive
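The effect of killing one signature byte can be modelled with a toy forward scanner. This is an illustrative sketch only: the function `first_archive_offset` is hypothetical and does not reproduce how any real RAR or 7z tool parses files, and the archive bodies are placeholders. The only real-world fact used is the RAR 4.x magic `Rar!\x1a\x07\x00`.

```python
RAR_SIGNATURE = b"Rar!\x1a\x07\x00"   # RAR 4.x magic bytes

def first_archive_offset(data: bytes) -> int:
    """Toy forward scan: offset of the first intact signature, or -1."""
    return data.find(RAR_SIGNATURE)

# Placeholder bodies; real archive data would follow each signature.
archive_a = RAR_SIGNATURE + b"<archive A body>"
archive_b = RAR_SIGNATURE + b"<archive B body>"

file_1 = bytearray(archive_a + archive_b)
file_2 = bytearray(file_1)
file_2[0] ^= 0xFF                      # corrupt the first signature byte

offset_1 = first_archive_offset(bytes(file_1))   # top archive is found
offset_2 = first_archive_offset(bytes(file_2))   # scanner falls through to the second
```

In this model the two files differ in a single byte, yet a forward-scanning reader would open a different archive in each.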

Slide 48

Slide 48 text

COM/MBR?

Slide 49

Slide 49 text

COM/MBR (DOS executable/Master Boot Record)
no signature!
start with x86 (16-bit) code at offset 0
like shell scripts, skip initial garbage
JMP to distinct addr rather than comments

Slide 50

Slide 50 text

JMP address1 JMP address2 address1: address2: //block 1 start //block 2 start //common payload

Slide 51

Slide 51 text

JPEG?

Slide 52

Slide 52 text

JPEG
2-byte signature 0xFFD8
sequence of chunks
idea
message 1: first chunk “commented”
message 2: first chunk processed

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

polyglots
2 distinct files, 3 valid file formats!
~virtual multicollisions

Slide 56

Slide 56 text

> msha1.py mbr_shell_rar*.pdf
5a827999 82b1c71a 5141963a b389abb9
mbr_shell_rar0.pdf 10382a6d3c949408d7cafaaf6d110a9e23230416
mbr_shell_rar1.pdf 10382a6d3c949408d7cafaaf6d110a9e23230416
> msha1.py jpg-rar*.jpg
5a827999 9b73a440 71599fc5 0c8a53e4
jpg-rar0.jpg 7a00042714d8ee6f4978193b07df705b652d0e39
jpg-rar1.jpg 7a00042714d8ee6f4978193b07df705b652d0e39
more magic: just 2 files here
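The source of msha1.py is not shown in the deck. The sketch below assumes that the first output line lists the four round constants K1..K4 (the first value, 5a827999, is the standard SHA-1 K1, consistent with "K1 unchanged") and that the tool is otherwise plain SHA-1. Under those assumptions, a minimal SHA-1 with replaceable constants could look like this; `sha1_custom` is a hypothetical name.

```python
import hashlib
import struct

STANDARD_K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_custom(message: bytes, k=STANDARD_K) -> str:
    """SHA-1 with the four round constants passed in as k."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Standard padding: 0x80, zeros, then the 64-bit big-endian bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f = (b & c) | (~b & d)
            elif t < 40:
                f = b ^ c ^ d
            elif t < 60:
                f = (b & c) | (b & d) | (c & d)
            else:
                f = b ^ c ^ d
            # One constant per group of 20 steps, as in standard SHA-1.
            temp = (rotl(a, 5) + f + e + k[t // 20] + w[t]) & 0xFFFFFFFF
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]
    return "".join(f"{x:08x}" for x in h)

# Constants copied from the first msha1.py run above.
modified_k = (0x5A827999, 0x82B1C71A, 0x5141963A, 0xB389ABB9)

standard_digest = sha1_custom(b"abc")
modified_digest = sha1_custom(b"abc", modified_k)
matches_hashlib = standard_digest == hashlib.sha1(b"abc").hexdigest()
```

With the standard constants the function agrees with hashlib's SHA-1, which is a quick sanity check before swapping in modified constants.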

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

Conclusions

Slide 59

Slide 59 text

Implications for SHA-1 security? None. We did not improve attacks on the unmodified SHA-1.

Slide 60

Slide 60 text

Did NSA use this trick when designing SHA-1 in 1995?
Probably not, because
1) the cryptanalysis techniques have been known since ~2004
2) the constants look like nothing-up-my-sleeve numbers, NUMSN (√2, √3, √5, √10)
3) remember the SHA-0 fiasco :)
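The square-root origin of the standard constants is easy to check. The sketch below recomputes floor(2^30 · √n) for n = 2, 3, 5, 10 with exact integer arithmetic and compares the results with the four standard SHA-1 round constants; the helper name `sqrt_constant` is made up for the example.

```python
import math

STANDARD_K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)

def sqrt_constant(n: int) -> int:
    """floor(2**30 * sqrt(n)), computed exactly: isqrt(n * 2**60)."""
    return math.isqrt(n << 60)

derived = tuple(sqrt_constant(n) for n in (2, 3, 5, 10))
all_match = derived == STANDARD_K
```

A designer who wanted to hide a backdoor in the constants would have had to give up this simple derivation, which is part of why the modified constants are detectable.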

Slide 61

Slide 61 text

Can you do the same for SHA-256?
Not at the moment.
Good: SHA-256 uses distinct constants at each step ⇒ more control to conform to the characteristic (but also more differences with the original)
Not good: the best known attack is on 31 steps (in ~2^65), of 64 steps in total, so it might be difficult to find a useful 64-step characteristic

Slide 62

Slide 62 text

thank you! questions?
Roads? Where we're going, we don't need roads.
malicioussha1.github.io
[email protected]