Binary Processing

Inndy
October 31, 2015

This slide licensed under CC 4.0 BY-SA (https://creativecommons.org/licenses/by-sa/4.0/)

Transcript

  1. BINARY PROCESSING OUTLINE
     ▸ Data Type
     ▸ Text Encoding
     ▸ Binary Encoding
     ▸ Integer Representation and Endianness
     ▸ Memory Model
     ▸ Data Structure
     ▸ Practice in Python
     ▸ xor-tool
  2. SAMPLE OF DATA TYPES
     ▸ Text Data
       ▸ Markdown Document, Hypertext Markup Language (HTML)
     ▸ Text Stream
       ▸ Hypertext Transfer Protocol (HTTP)
     ▸ Binary Data
       ▸ Portable Network Graphics (PNG)
     ▸ Binary Stream
       ▸ Secure Shell (SSH), Secure Sockets Layer (SSL)
  3. TEXT ENCODING
     ▸ Common Encodings
       ▸ ASCII, UTF-8, UCS-2, latin1
     ▸ Common Encodings in Asia
       ▸ Big5, HKSCS, Shift_JIS, GBK
  4. TEXT ENCODING - UTF-8
     ▸ A modern Unicode encoding standard
     ▸ Variable-length encoding (1 to 4 bytes per code point)
     ▸ Emoji!
     ▸ ASCII is a subset of UTF-8
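The variable-length property and the ASCII-subset claim can both be checked in a few lines. This is a small sketch; the sample characters are picked only for illustration.

```python
# Four sample characters, written as escapes so the source stays pure ASCII:
# "A", e-acute, a CJK ideograph, and an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch in samples:
    # ascii() keeps the output printable on any terminal encoding.
    print(ascii(ch), ch.encode("utf-8").hex(), utf8_lengths[ch])

# Pure-ASCII text yields identical bytes under ASCII and UTF-8.
ascii_text = "Hello, Hacker"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)
```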
  5. TEXT ENCODING - BIG5 FAMILY
     ▸ Traditional Chinese characters
     ▸ Includes CP950, Big5, and HKSCS; Python and iconv handle them differently
     ▸ Legacy encoding, considered an obstacle
  6. TEXT ENCODING - LATIN1
     ▸ Every byte value from 0 to 0xFF is used, so any byte sequence can be stored in this encoding (a useful technique in Python)
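As a quick illustration of that technique, the sketch below round-trips every possible byte value through Python's latin1 codec and shows that the same bytes are rejected by UTF-8.

```python
# All 256 possible byte values, in order.
raw = bytes(range(256))

# latin1 maps each byte to the code point with the same number,
# so decoding never raises and encoding restores the original bytes.
text = raw.decode("latin1")
restored = text.encode("latin1")
print(len(text), restored == raw)

# The same bytes are not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)
```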
  7. BINARY ENCODING - HEX ENCODE
     ▸ Doubles the size
     ▸ Looks like this: 407f9849b457041a30bb6b4f091a50f4
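A short sketch of hex encoding and decoding in Python, using the string from the slide as input. It goes through `binascii` so it does not rely on `bytes.hex()`, which needs Python 3.5 or later.

```python
import binascii

hex_string = "407f9849b457041a30bb6b4f091a50f4"

# Hex text to raw bytes: every two hex digits become one byte.
raw = bytes.fromhex(hex_string)

# Raw bytes back to hex text.
hex_text = binascii.hexlify(raw).decode("ascii")

print(len(raw), len(hex_text))
print(hex_text == hex_string)
```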
  8. BINARY ENCODING - URL ENCODE
     ▸ URL Encode
     ▸ Used in browsers
     ▸ from urllib.parse import quote, unquote
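A minimal sketch of the two functions named on the slide. The sample string is made up, and `safe=""` is passed so that `/` is percent-encoded as well (by default `quote` leaves it alone).

```python
from urllib.parse import quote, unquote

# Sample input with a space, reserved characters, and one non-ASCII letter.
original = "a b&c=d/\u00e9"

# Non-ASCII characters are encoded as UTF-8 first, then percent-encoded.
encoded = quote(original, safe="")
decoded = unquote(encoded)

print(encoded)
print(decoded == original)
```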
  9. BINARY ENCODING - BASE 16/32/64/85
     ▸ Size expansion rate:
       ▸ base16 (hex): 2
       ▸ base32: 8/5
       ▸ base64: 4/3
       ▸ base85: 5/4
     ▸ base64 looks like this: fI3UMEbM9EoGSXQjl3LD/A==
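The expansion rates can be checked directly with the standard `base64` module. In this sketch the input length of 60 bytes is an arbitrary choice: it is a multiple of 3, 4 and 5, so no encoding needs padding and the ratios come out exact.

```python
import base64

# 60 bytes of input; the content does not matter, only the length.
raw = bytes(range(60))

sizes = {
    "base16": len(base64.b16encode(raw)),
    "base32": len(base64.b32encode(raw)),
    "base64": len(base64.b64encode(raw)),
    "base85": len(base64.b85encode(raw)),
}

for name, size in sizes.items():
    # Ratio of encoded length to raw length.
    print(name, size, size / len(raw))
```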
  10. INTEGER REPRESENTATION
      ▸ What is the basic unit of memory?
        ▸ Byte
      ▸ How does a computer store integers?
        ▸ Storing from the lowest byte to the highest byte is called Little-Endian
        ▸ Ex: int32_t n = 0xaabbccdd;
        ▸ In memory: dd cc bb aa
      ▸ Let's try!
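One way to try this without a debugger is to serialize the integer in Python. The sketch below treats the value as an unsigned 32-bit pattern; the bytes in memory are the same as for the signed `int32_t` on the slide.

```python
import struct
import sys

n = 0xaabbccdd

# Lowest byte first (little-endian) versus highest byte first (big-endian).
little = n.to_bytes(4, "little")
big = n.to_bytes(4, "big")

# struct gives the same result: "<" means little-endian, "I" means uint32.
packed = struct.pack("<I", n)

print("little:", little.hex())
print("big:   ", big.hex())
print("native byte order of this machine:", sys.byteorder)

# Reading the bytes back with the matching byte order restores the value.
roundtrip = int.from_bytes(little, "little")
```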
  11. MEMORY MODEL
      ▸ Address (integer) and Data (byte cells)
      ▸ The minimal addressable unit is the byte
      ▸ Use HxD / Cheat Engine / gdb to inspect process memory
      ▸ Use HxD / xxd / hexdump to inspect binary files
      ▸ Let's try!
      ▸ Pointer: actually just an integer; an address stored as data
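To make the address-plus-byte-cells picture concrete, here is a rough sketch of a dump function in the spirit of xxd or hexdump. The function name `hexdump` and its column layout are choices made for this example and do not reproduce the exact output of those tools.

```python
def hexdump(data, width=16):
    """Return a text dump: offset, hex bytes, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("{:02x}".format(b) for b in chunk)
        # Non-printable bytes are shown as "." in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text column lines up on a short last row.
        lines.append("{:08x}  {}  {}".format(
            offset, hex_part.ljust(width * 3 - 1), text_part))
    return "\n".join(lines)


dump = hexdump(b"Hello, Hacker\x00\x01\x02\xff")
print(dump)
```

The left column plays the role of the address, and each two-digit hex value is one byte cell.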
  12. DATA STRUCTURE
      ▸ A classic file structure contains:
        ▸ Magic Signature
        ▸ Header
        ▸ Size
        ▸ Table
        ▸ Offset (Address)
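The pieces listed above can be seen together in a toy container format. Everything about the layout below (the `DEMO` magic, the field order, the helper names `build_demo_file` and `parse_demo_file`) is invented for this sketch and does not describe any real file format.

```python
import struct

# Hypothetical little-endian layout, made up for illustration:
#   header: 4-byte magic b"DEMO", uint16 version, uint16 entry count
#   table:  one (uint32 offset, uint32 size) pair per entry
#   body:   the entry payloads, located by the offsets in the table
HEADER = struct.Struct("<4sHH")
ENTRY = struct.Struct("<II")


def build_demo_file(payloads):
    """Serialize a list of byte strings into the toy format."""
    body_start = HEADER.size + ENTRY.size * len(payloads)
    table = b""
    body = b""
    for payload in payloads:
        # Offsets are absolute positions from the start of the file.
        table += ENTRY.pack(body_start + len(body), len(payload))
        body += payload
    return HEADER.pack(b"DEMO", 1, len(payloads)) + table + body


def parse_demo_file(blob):
    """Check the magic signature, then follow the table to each payload."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != b"DEMO":
        raise ValueError("bad magic signature")
    items = []
    for i in range(count):
        offset, size = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        items.append(blob[offset:offset + size])
    return version, items


blob = build_demo_file([b"first", b"second!"])
version, items = parse_demo_file(blob)
print(blob.hex())
print(version, items)
```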
  13. PRACTICE IN PYTHON - BASIC TYPES
      ▸ 3 basic types: str, bytes, bytearray
      ▸ str literal: 'This is str'
      ▸ bytes literal: b'rawbytedata'
      ▸ str to bytes: str.encode("encoding")
      ▸ bytes to str: bytes.decode("encoding")
      ▸ bytes to bytearray: bytearray(b'converting')
  14. PRACTICE IN PYTHON - BASIC TYPES

      s = 'Hello, Hacker'
      b = s.encode('ascii')
      a = bytearray(b)
      print(type(s), type(b), type(a))
  15. PRACTICE IN PYTHON - FILE OPERATION

      # generate a binary file containing every byte value from 0 to 255
      with open('out.dat', 'wb') as fout:
          fout.write(bytes(range(256)))
  16. PRACTICE IN PYTHON - FILE OPERATION

      # read a binary file
      with open('out.dat', 'rb') as fin:
          data = fin.read()

      print(data.hex())  # Py3.5+

      # Py3.4
      # import binascii
      # print(binascii.hexlify(data))

      # Py2.7
      # print(data.encode('hex'))
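  17. PRACTICE IN PYTHON - XOR A FILE

      # read the file contents first; bytearray() needs bytes, not a file object
      with open('data.bin', 'rb') as fin:
          content = bytearray(fin.read())

      # XOR every byte with the single-byte key 0x9c
      for i in range(len(content)):
          content[i] ^= 0x9c

      with open('out.bin', 'wb') as fout:
          fout.write(content)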
  18. PRACTICE IN PYTHON - BASE_XX ENCODING

      import base64

      data = bytes.fromhex('61626364')
      print(base64.b85encode(data))
      print(base64.b64encode(data))
      print(base64.b32encode(data))
      print(data.hex())
      # also b**decode(b'encoded data')
  19. PRACTICE IN PYTHON - XOR ENCRYPT AND BASE64 ENCODE

      import base64

      data = input('Data to be encrypted:')
      data = bytearray(data.encode('utf-8'))
      key = input('Key:').encode('utf-8')
      len_key = len(key)
      # XOR each data byte with the key, repeating the key as needed
      for i in range(len(data)):
          data[i] ^= key[i % len_key]
      data = base64.b64encode(data)
      print(data.decode('ascii'))
  20. PRACTICE IN PYTHON - BASE64 DECODE AND XOR DECRYPT

      import base64

      data = input('Data to be decrypted:')
      data = data.encode('ascii')
      data = bytearray(base64.b64decode(data))
      key = input('Key:').encode('utf-8')
      len_key = len(key)
      # XOR with the same repeating key restores the original bytes
      for i in range(len(data)):
          data[i] ^= key[i % len_key]
      data = bytes(data).decode('utf-8')
      print(data)
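The two scripts share the same XOR loop, so they can be folded into one helper and checked as a round trip. This is a sketch only; the names `xor_with_key`, `encrypt` and `decrypt` are made up here and are not from the slides.

```python
import base64


def xor_with_key(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def encrypt(text, key):
    """UTF-8 encode, XOR with the key, then Base64 encode to ASCII text."""
    xored = xor_with_key(text.encode("utf-8"), key.encode("utf-8"))
    return base64.b64encode(xored).decode("ascii")


def decrypt(token, key):
    """Base64 decode, XOR with the same key, then decode as UTF-8."""
    xored = base64.b64decode(token)
    return xor_with_key(xored, key.encode("utf-8")).decode("utf-8")


token = encrypt("Hello, Hacker", "k3y")
print(token)
print(decrypt(token, "k3y"))
```

Because XOR is its own inverse, the same helper serves both directions; only the Base64 step differs.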
  21. XOR TOOL

      inndy@inndy-mac ~$ pip2 install xortool
      Collecting xortool
      Installing collected packages: xortool
      Successfully installed xortool-0.95
      inndy@inndy-mac ~$ xortool data -c ' '
      The most probable key lengths:
         1:   33.0%
        19:   13.5%
        21:   9.8%
        23:   9.0%
        25:   7.9%
        28:   6.9%
        32:   6.3%
        36:   4.8%
        38:   5.3%
        40:   3.7%
      Key-length can be 4*n
      1 possible key(s) of length 1:
      \xa5
      Found 1 plaintexts with 95.0%+ printable characters
      See files filename-key.csv, filename-char_used-perc_printable.csv
      inndy@inndy-mac ~$ xortool data -c ' ' -l 1
      1 possible key(s) of length 1:
      \xa5
      Found 1 plaintexts with 95.0%+ printable characters
      See files filename-key.csv, filename-char_used-perc_printable.csv
      inndy@inndy-mac ~$ cat xortool_out/0.out
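The idea behind passing `-c ' '` is that the most frequent plaintext character is assumed to be a space. The sketch below is a simplified illustration of that idea for a single-byte key; it is not xortool's actual implementation, and the function name `guess_single_byte_key` and the sample sentence are made up for this example.

```python
from collections import Counter


def guess_single_byte_key(ciphertext, most_common_plain=b" "):
    """Assume the most frequent ciphertext byte is the encrypted form of
    the most frequent plaintext byte, and derive the key from that pair."""
    most_common_cipher = Counter(ciphertext).most_common(1)[0][0]
    return most_common_cipher ^ most_common_plain[0]


# Build a test ciphertext with a known key so the guess can be checked.
plaintext = b"the quick brown fox jumps over the lazy dog and keeps on running"
key = 0xa5
ciphertext = bytes(b ^ key for b in plaintext)

guessed = guess_single_byte_key(ciphertext)
recovered = bytes(b ^ guessed for b in ciphertext)

print(hex(guessed))
print(recovered.decode("ascii"))
```

The guess only works when the assumption about the most frequent character holds, which is usually true for English text of reasonable length.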