Slide 1

Slide 1 text

BINARY PROCESSING NTUST INFORMATION SECURITY RESEARCH CLUB

Slide 2

Slide 2 text

BINARY PROCESSING OUTLINE ▸ Data Type ▸ Text Encoding ▸ Binary Encoding ▸ Integer Representation and Endian ▸ Memory Model ▸ Data Structure ▸ Practice in Python ▸ xortool

Slide 3

Slide 3 text

DATA TYPES (diagram) ▸ TEXT (Decoded Binary) vs BINARY ▸ DATA vs STREAM (Series of Data)

Slide 4

Slide 4 text

BINARY PROCESSING SAMPLE OF DATA TYPES ▸ Text Data ▸ Markdown Document, Hypertext Markup Language (HTML) ▸ Text Stream ▸ Hypertext Transfer Protocol (HTTP) ▸ Binary Data ▸ Portable Network Graphics (PNG) ▸ Binary Stream ▸ Secure Shell (SSH), Secure Sockets Layer (SSL)

Slide 5

Slide 5 text

BINARY PROCESSING TEXT ENCODING ▸ Common Encoding ▸ ASCII, UTF-8, UCS2, latin1 ▸ Common Encoding in Asia ▸ Big5, HKSCS, ShiftJIS, GBK

Slide 6

Slide 6 text

BINARY PROCESSING TEXT ENCODING - ASCII ▸ One of the earliest standardized character encodings (7-bit, 128 code points) ▸ Printable characters: 0x20 (' ') ~ 0x7E ('~')

Slide 7

Slide 7 text

BINARY PROCESSING TEXT ENCODING - UTF-8 ▸ A modern encoding of Unicode ▸ Variable-length encoding (1 to 4 bytes per code point) ▸ Emoji! ▸ ASCII is a subset of UTF-8
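A minimal sketch of the two UTF-8 properties that matter most in practice: the number of bytes per character varies, and pure ASCII text encodes to identical bytes under both encodings. The sample characters are arbitrary picks for illustration.

```python
# One character from each UTF-8 length class (1, 2, 3 and 4 bytes).
samples = ['A', '\u00e9', '\u4e2d', '\U0001F600']

for ch in samples:
    encoded = ch.encode('utf-8')
    # code point, encoded bytes in hex, number of bytes
    print(hex(ord(ch)), encoded.hex(), len(encoded))

# ASCII-only text yields the same bytes in ASCII and in UTF-8.
ascii_text = 'Hello'
same_bytes = ascii_text.encode('ascii') == ascii_text.encode('utf-8')
print(same_bytes)
```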

Slide 8

Slide 8 text

BINARY PROCESSING TEXT ENCODING - BIG5 FAMILY ▸ Traditional Chinese characters ▸ Includes CP950, Big5, and HKSCS, which are distinct encodings in both Python and iconv ▸ Legacy encoding, often considered an obstacle

Slide 9

Slide 9 text

BINARY PROCESSING TEXT ENCODING - LATIN1 ▸ Every value from 0 to 0xFF maps to a character, so any byte sequence can be stored in this encoding (a useful technique in Python)
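The latin1 trick can be sketched as a round trip: decoding arbitrary bytes as latin1 never fails, and encoding the result gives back exactly the original bytes. This is only an illustration of the property, not a recommendation for every situation.

```python
# All 256 possible byte values, including ones that are invalid UTF-8.
raw = bytes(range(256))

text = raw.decode('latin1')   # each byte becomes the code point of the same value
back = text.encode('latin1')  # and maps back to the same byte

print(len(text), back == raw)
```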

Slide 10

Slide 10 text

BINARY PROCESSING BINARY ENCODING - HEX ENCODE ▸ Output is twice the size of the input (two hex digits per byte) ▸ Looks like this: 407f9849b457041a30bb6b4f091a50f4
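A short sketch of hex encoding in Python 3.5+, using the first four bytes of the sample string above as input. It shows the size doubling and the round trip back to bytes.

```python
data = b'\x40\x7f\x98\x49'

hex_text = data.hex()               # two hex digits per input byte
restored = bytes.fromhex(hex_text)  # inverse operation

print(hex_text, len(data), len(hex_text))
```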

Slide 11

Slide 11 text

BINARY PROCESSING BINARY ENCODING - URL ENCODE ▸ Also known as percent-encoding ▸ Used in browsers ▸ In Python: from urllib.parse import quote, unquote
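A minimal sketch of quote and unquote from the import shown on the slide. The input string is a made-up example; with default arguments, quote encodes non-ASCII text as UTF-8 and escapes each byte as %XX.

```python
from urllib.parse import quote, unquote

original = 'a b&c=中文'          # hypothetical input with a space, reserved characters and CJK text

encoded = quote(original)        # reserved and non-ASCII characters become %XX escapes
decoded = unquote(encoded)       # reverses the escaping

print(encoded)
print(decoded == original)
```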

Slide 12

Slide 12 text

BINARY PROCESSING BINARY ENCODING - BASE 16/32/64/85 ▸ Size expansion rate: ▸ base16 (hex): 2 ▸ base32: 8/5 ▸ base64: 4/3 ▸ base85: 5/4 ▸ base64 looks like this: fI3UMEbM9EoGSXQjl3LD/A==
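The expansion rates can be checked with the standard base64 module. This sketch uses a 60-byte input, chosen because it is a multiple of 3, 4 and 5, so no encoder has to add padding and the ratios come out exact.

```python
import base64

data = bytes(range(60))  # 60 bytes: no padding needed for any of the encodings

encoders = {
    'base16': base64.b16encode,
    'base32': base64.b32encode,
    'base64': base64.b64encode,
    'base85': base64.b85encode,
}

sizes = {name: len(encode(data)) for name, encode in encoders.items()}
for name, size in sizes.items():
    print(name, size, size / len(data))
```

For inputs whose length is not such a multiple, padding makes the output slightly larger than the nominal rate.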

Slide 13

Slide 13 text

BINARY PROCESSING INTEGER REPRESENTATION ▸ What is the basic unit of memory? ▸ Byte ▸ How does a computer store integers? ▸ Storing from the lowest byte to the highest byte is called Little-Endian (the reverse order is Big-Endian) ▸ Ex: uint32_t n = 0xaabbccdd; ▸ In memory (Little-Endian): dd cc bb aa ▸ Let's try!
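One way to try this without a debugger is the struct module, which packs an integer with an explicit byte order. The sketch below treats the value from the slide as an unsigned 32-bit integer.

```python
import struct

n = 0xaabbccdd

little = struct.pack('<I', n)  # '<' = little-endian, 'I' = unsigned 32-bit
big = struct.pack('>I', n)     # '>' = big-endian

print(little.hex())  # lowest byte first
print(big.hex())     # highest byte first

# Reading the bytes back requires knowing which order was used.
restored = int.from_bytes(little, 'little')
```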

Slide 14

Slide 14 text

MEMORY MODEL

Slide 15

Slide 15 text

BINARY PROCESSING MEMORY MODEL ▸ Address (integer) and Data (byte cells) ▸ The smallest addressable unit is the byte ▸ Use HxD / Cheat Engine / gdb to inspect process memory ▸ Use HxD / xxd / hexdump to inspect a binary file ▸ Let's try! ▸ Pointer: actually just an integer, an address stored as data
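The address-and-bytes model can also be poked at from Python with ctypes. This is only a sketch for illustration: it allocates one 32-bit value, takes its address as a plain integer, and reads the four byte cells at that address. The byte order seen depends on the machine.

```python
import ctypes
import sys

value = ctypes.c_uint32(0xaabbccdd)

address = ctypes.addressof(value)   # a pointer is just an integer
raw = ctypes.string_at(address, 4)  # read 4 byte cells starting at that address

print(hex(address), raw.hex(), sys.byteorder)
```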

Slide 16

Slide 16 text

BINARY PROCESSING DATA STRUCTURE ▸ A classic file structure contains: ▸ Magic Signature ▸ Header ▸ Size ▸ Table ▸ Offset (Address)
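To make these parts concrete, here is a sketch of a made-up container format; the DEMO magic, the field layout and the build/parse helpers are all hypothetical and exist only for illustration. It has a magic signature, a header with a count, a table of (offset, size) entries, and the payloads those offsets point to.

```python
import struct

# Hypothetical layout (little-endian):
#   header: 4-byte magic, uint16 version, uint16 entry count
#   table:  one (uint32 offset, uint32 size) pair per entry
#   body:   the payload bytes the table points to
HEADER = struct.Struct('<4sHH')
ENTRY = struct.Struct('<II')

def build(payloads):
    offset = HEADER.size + ENTRY.size * len(payloads)  # body starts after the table
    table, body = b'', b''
    for payload in payloads:
        table += ENTRY.pack(offset, len(payload))
        body += payload
        offset += len(payload)
    return HEADER.pack(b'DEMO', 1, len(payloads)) + table + body

def parse(blob):
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != b'DEMO':
        raise ValueError('bad magic signature')
    items = []
    for i in range(count):
        offset, size = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        items.append(blob[offset:offset + size])
    return items

blob = build([b'hello', b'binary'])
print(blob.hex())
print(parse(blob))
```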

Slide 17

Slide 17 text

BINARY PROCESSING DATA STRUCTURE BMP FILE Practice:
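As a starting point for the practice, this sketch builds a tiny 2x2, 24-bit BMP in memory and parses its two headers back with struct. It assumes the common layout of a 14-byte file header followed by a 40-byte BITMAPINFOHEADER; other header variants exist and are not handled.

```python
import struct

FILE_HEADER = struct.Struct('<2sIHHI')       # magic, file size, 2 reserved, pixel data offset
INFO_HEADER = struct.Struct('<IiiHHIIiiII')  # header size, width, height, planes, bpp, ...

width, height = 2, 2
row = b'\x00\x00\xff' * width          # pixels are stored as B, G, R: two red pixels
row += b'\x00' * (-len(row) % 4)       # each row is padded to a multiple of 4 bytes
pixels = row * height

offset = FILE_HEADER.size + INFO_HEADER.size
bmp = (FILE_HEADER.pack(b'BM', offset + len(pixels), 0, 0, offset)
       + INFO_HEADER.pack(40, width, height, 1, 24, 0, len(pixels), 2835, 2835, 0, 0)
       + pixels)

# Parse it back: magic signature, size, and offset to the pixel data.
magic, file_size, _, _, data_offset = FILE_HEADER.unpack_from(bmp, 0)
header_size, w, h, planes, bpp = INFO_HEADER.unpack_from(bmp, FILE_HEADER.size)[:5]
print(magic, file_size, data_offset, w, h, bpp)
```

Writing bmp to a file with open('out.bmp', 'wb') and opening it in a hex editor is a quick way to compare the fields with the raw bytes.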

Slide 18

Slide 18 text

BINARY PROCESSING DATA STRUCTURE GIF FILE Practice:
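A sketch for the GIF practice: the first 13 bytes of a GIF file hold the signature and the logical screen descriptor. The header below is constructed by hand with made-up dimensions rather than read from a real file.

```python
import struct

# signature+version, width, height, packed flags, background index, aspect ratio
GIF_HEADER = struct.Struct('<6sHHBBB')

# Hypothetical header for a 320x200 image with a 256-entry global color table.
header = GIF_HEADER.pack(b'GIF89a', 320, 200, 0xF7, 0, 0)

signature, w, h, packed, background, aspect = GIF_HEADER.unpack(header)
has_global_table = bool(packed & 0x80)       # top bit: global color table present
table_entries = 2 ** ((packed & 0x07) + 1)   # low 3 bits encode the table size

print(signature, w, h, has_global_table, table_entries)
```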

Slide 19

Slide 19 text

BINARY PROCESSING DATA STRUCTURE PNG FILE Practice:
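A sketch for the PNG practice: a PNG file is an 8-byte signature followed by chunks, each made of a big-endian length, a 4-byte type, the data, and a CRC32 over type plus data. The make_chunk and iter_chunks helpers are illustrative names, and the 1x1 grayscale image is built in memory so the example needs no input file.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def make_chunk(chunk_type, data):
    crc = zlib.crc32(chunk_type + data)
    return struct.pack('>I', len(data)) + chunk_type + data + struct.pack('>I', crc)

def iter_chunks(blob):
    if blob[:8] != PNG_SIGNATURE:
        raise ValueError('not a PNG file')
    pos = 8
    while pos < len(blob):
        length, chunk_type = struct.unpack_from('>I4s', blob, pos)
        data = blob[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from('>I', blob, pos + 8 + length)
        yield chunk_type, data, crc == zlib.crc32(chunk_type + data)
        pos += 12 + length  # length + type + crc fields, plus the data

# 1x1 image, 8-bit grayscale: width, height, depth, color type, compression, filter, interlace
ihdr = struct.pack('>IIBBBBB', 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b'\x00\x00')  # one scanline: filter byte 0, then one black pixel
png = (PNG_SIGNATURE + make_chunk(b'IHDR', ihdr)
       + make_chunk(b'IDAT', idat) + make_chunk(b'IEND', b''))

for chunk_type, data, crc_ok in iter_chunks(png):
    print(chunk_type, len(data), crc_ok)
```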

Slide 20

Slide 20 text

BINARY PROCESSING PRACTICE IN PYTHON - BASIC TYPES ▸ 3 BASIC TYPES: STR, BYTES, BYTEARRAY ▸ str: 'This is str' ▸ bytes: b'rawbytedata' ▸ bytearray: bytearray(b'converting') ▸ str to bytes: str.encode("encoding") ▸ bytes to str: bytes.decode("encoding")

Slide 21

Slide 21 text

BINARY PROCESSING PRACTICE IN PYTHON - BASIC TYPES
s = 'Hello, Hacker'
b = s.encode('ascii')
a = bytearray(b)
print(type(s), type(b), type(a))

Slide 22

Slide 22 text

BINARY PROCESSING PRACTICE IN PYTHON - FILE OPERATION
# generate a binary file containing every byte value from 0 to 255
with open('out.dat', 'wb') as fout:
    fout.write(bytes(range(256)))

Slide 23

Slide 23 text

BINARY PROCESSING PRACTICE IN PYTHON - FILE OPERATION
# read a binary file
with open('out.dat', 'rb') as fin:
    data = fin.read()
print(data.hex())  # Python 3.5+
# Python 3.4:
# import binascii
# print(binascii.hexlify(data))
# Python 2.7:
# print(data.encode('hex'))

Slide 24

Slide 24 text

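BINARY PROCESSING PRACTICE IN PYTHON - XOR A FILE
# read the whole file; bytearray() needs the bytes, not the file object
with open('data.bin', 'rb') as fin:
    content = bytearray(fin.read())
# XOR every byte with the single-byte key 0x9c
for i in range(len(content)):
    content[i] ^= 0x9c
with open('out.bin', 'wb') as fout:
    fout.write(content)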

Slide 25

Slide 25 text

BINARY PROCESSING PRACTICE IN PYTHON - BASE_XX ENCODING
import base64
data = bytes.fromhex('61626364')
print(base64.b85encode(data))
print(base64.b64encode(data))
print(base64.b32encode(data))
print(data.hex())
# to decode, use the matching base64.b85decode / b64decode / b32decode on the encoded bytes

Slide 26

Slide 26 text

BINARY PROCESSING PRACTICE IN PYTHON - XOR ENCRYPT AND BASE64 ENCODE
import base64
data = input('Data to encrypt: ')
data = bytearray(data.encode('utf-8'))
key = input('Key: ').encode('utf-8')
len_key = len(key)
# XOR each byte with the key, repeating the key as needed
for i in range(len(data)):
    data[i] ^= key[i % len_key]
# the XOR result is arbitrary binary, so base64-encode it for printing
data = base64.b64encode(data)
print(data.decode('ascii'))

Slide 27

Slide 27 text

BINARY PROCESSING PRACTICE IN PYTHON - BASE64 DECODE AND XOR DECRYPT
import base64
data = input('Data to decrypt: ')
data = data.encode('ascii')
data = bytearray(base64.b64decode(data))
key = input('Key: ').encode('utf-8')
len_key = len(key)
# XOR with the same repeating key restores the original bytes
for i in range(len(data)):
    data[i] ^= key[i % len_key]
data = bytes(data).decode('utf-8')
print(data)

Slide 28

Slide 28 text

BINARY PROCESSING XOR TOOL
inndy@inndy-mac ~$ pip2 install xortool
Collecting xortool
Installing collected packages: xortool
Successfully installed xortool-0.95
inndy@inndy-mac ~$ xortool data -c ' '
The most probable key lengths:
1: 33.0%
19: 13.5%
21: 9.8%
23: 9.0%
25: 7.9%
28: 6.9%
32: 6.3%
36: 4.8%
38: 5.3%
40: 3.7%
Key-length can be 4*n
1 possible key(s) of length 1:
\xa5
Found 1 plaintexts with 95.0%+ printable characters
See files filename-key.csv, filename-char_used-perc_printable.csv
inndy@inndy-mac ~$ xortool data -c ' ' -l 1
1 possible key(s) of length 1:
\xa5
Found 1 plaintexts with 95.0%+ printable characters
See files filename-key.csv, filename-char_used-perc_printable.csv
inndy@inndy-mac ~$ cat xortool_out/0.out
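The idea behind passing -c ' ' can be sketched in a few lines for the single-byte case: if the most frequent plaintext byte is a space, then the most frequent ciphertext byte XOR 0x20 is the key. This is a simplified illustration and not xortool's actual implementation; the helper names and the sample plaintext are made up.

```python
from collections import Counter

def xor_bytes(data, key):
    # XOR every byte with a single-byte key
    return bytes(b ^ key for b in data)

def guess_single_byte_key(ciphertext, most_common_plain=b' '):
    # Assume the most frequent ciphertext byte encrypts the most frequent plaintext byte.
    most_common_cipher = Counter(ciphertext).most_common(1)[0][0]
    return most_common_cipher ^ most_common_plain[0]

plaintext = b'the quick brown fox jumps over the lazy dog and runs far away'
ciphertext = xor_bytes(plaintext, 0xa5)

key = guess_single_byte_key(ciphertext)
print(hex(key), xor_bytes(ciphertext, key))
```

The guess only works when the assumed character really is the most common one, which is why binary files are often attacked with -c '\x00' style guesses instead of a space.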