Binary Processing

Inndy
October 31, 2015

This slide licensed under CC 4.0 BY-SA (https://creativecommons.org/licenses/by-sa/4.0/)


Transcript

  1. BINARY PROCESSING OUTLINE
    ▸ Data Type
    ▸ Text Encoding
    ▸ Binary Encoding
    ▸ Integer Representation and Endianness
    ▸ Memory Model
    ▸ Data Structure
    ▸ Practice in Python
    ▸ xortool
  2. SAMPLE OF DATA TYPES
    ▸ Text Data: Markdown Document, Hypertext Markup Language (HTML)
    ▸ Text Stream: Hypertext Transfer Protocol (HTTP)
    ▸ Binary Data: Portable Network Graphics (PNG)
    ▸ Binary Stream: Secure Shell (SSH), Secure Sockets Layer (SSL)
  3. TEXT ENCODING
    ▸ Common encodings: ASCII, UTF-8, UCS-2, Latin-1
    ▸ Common encodings in Asia: Big5, HKSCS, Shift JIS, GBK
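To make the list above concrete, here is a small sketch that encodes one string with several of these codecs and compares the results. The sample text is an arbitrary choice for illustration, and the codec names are the ones Python's standard library uses.

```python
# Encode the same text with several codecs and compare the resulting bytes.
text = '\u4e2d\u6587'  # "中文", an arbitrary sample chosen for this sketch

sizes = {}
for codec in ('utf-8', 'utf-16-le', 'big5', 'gbk', 'shift_jis'):
    encoded = text.encode(codec)
    sizes[codec] = len(encoded)
    print(codec.ljust(10), len(encoded), 'bytes', encoded.hex())

# ASCII has no code for these characters, so encoding fails.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print('encodable as ASCII:', ascii_ok)
```

The same two characters produce different byte sequences under each codec, which is why a file's encoding has to be known (or guessed) before its bytes can be read as text.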
  4. TEXT ENCODING: UTF-8
    ▸ A modern encoding of Unicode
    ▸ Variable-length encoding (1 to 4 bytes per code point)
    ▸ Emoji!
    ▸ ASCII is a subset of UTF-8
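A quick way to see the variable-length property is to encode a few characters and count the bytes. The sample characters below are chosen only for illustration.

```python
# One character from each UTF-8 length class (1, 2, 3 and 4 bytes).
samples = ['A', '\u00e9', '\u4e2d', '\U0001F600']  # A, é, 中, 😀

lengths = {ch: len(ch.encode('utf-8')) for ch in samples}
for ch, size in lengths.items():
    print(repr(ch), size, 'byte(s)', ch.encode('utf-8').hex())

# Pure ASCII text produces identical bytes under both encodings.
ascii_text = 'Hello, Hacker'
same_bytes = ascii_text.encode('ascii') == ascii_text.encode('utf-8')
print('ASCII bytes equal UTF-8 bytes:', same_bytes)
```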
  5. TEXT ENCODING: BIG5 FAMILY
    ▸ Traditional Chinese characters
    ▸ Includes CP950, Big5, and HKSCS; these differ from one another in Python and iconv
    ▸ Considered an obstacle; a legacy encoding
  6. TEXT ENCODING: LATIN-1
    ▸ All byte values 0x00 ~ 0xFF are used, so any byte sequence can be stored in this encoding (a useful technique in Python)
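The Python technique mentioned here can be sketched as a round trip: every possible byte value decodes to exactly one character under `latin-1`, and encoding gives the original bytes back. The UTF-8 comparison is added only for contrast.

```python
# Every byte value from 0x00 to 0xFF.
raw = bytes(range(256))

# latin-1 maps byte N to code point N, so decoding never fails.
as_text = raw.decode('latin-1')
restored = as_text.encode('latin-1')
print('round trip is lossless:', restored == raw)

# The same bytes are not valid UTF-8, so a strict decode raises.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print('decodable as UTF-8:', utf8_ok)
```

This is handy when binary data has to pass through an API that only accepts `str`.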
  7. BINARY ENCODING: HEX ENCODE
    ▸ Doubles the size
    ▸ Looks like this: 407f9849b457041a30bb6b4f091a50f4
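As a minimal sketch, the sample string from the slide can be turned back into raw bytes and re-encoded, which also shows the doubling in size.

```python
import binascii

hex_text = '407f9849b457041a30bb6b4f091a50f4'  # sample from the slide

# Two hex digits describe one byte, so 32 digits become 16 bytes.
raw = bytes.fromhex(hex_text)
print(len(hex_text), 'hex digits ->', len(raw), 'bytes')

# binascii.hexlify returns bytes; decode to get a str again.
back = binascii.hexlify(raw).decode('ascii')
print(back)
```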
  8. BINARY ENCODING: URL ENCODE
    ▸ URL encoding (percent-encoding)
    ▸ Used in browsers
    ▸ from urllib.parse import quote, unquote
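A short sketch of the two functions named on the slide. The input strings are made up for illustration.

```python
from urllib.parse import quote, unquote

# Reserved characters and spaces are replaced by %XX escapes.
query = quote('a b&c=d')
print(query)

# Non-ASCII text is encoded as UTF-8 first, then escaped byte by byte.
escaped = quote('\u4e2d\u6587')  # "中文"
print(escaped)

# By default '/' is treated as safe and left alone.
path = quote('/path/with space')
print(path)

# unquote reverses the escaping.
print(unquote(escaped))
```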
  9. BINARY ENCODING: BASE 16/32/64/85
    ▸ Size expansion rate:
      ▸ base16 (hex): 2
      ▸ base32: 8/5
      ▸ base64: 4/3
      ▸ base85: 5/4
    ▸ base64 looks like this: fI3UMEbM9EoGSXQjl3LD/A==
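The expansion rates can be checked with Python's `base64` module. This sketch uses a 60-byte input, a length picked so that none of the encodings needs padding and the ratios come out exact.

```python
import base64
from fractions import Fraction

data = bytes(range(60))  # 60 is divisible by 3, 4 and 5, so no padding

encoders = {
    'base16': base64.b16encode,
    'base32': base64.b32encode,
    'base64': base64.b64encode,
    'base85': base64.b85encode,
}

# Output length divided by input length, kept as an exact fraction.
ratios = {name: Fraction(len(enc(data)), len(data))
          for name, enc in encoders.items()}
for name, ratio in ratios.items():
    print(name, ratio)

# The base64 sample from the slide decodes to 16 raw bytes.
sample = base64.b64decode('fI3UMEbM9EoGSXQjl3LD/A==')
print(len(sample), 'bytes:', sample.hex())
```

For inputs whose length is not a multiple of the block size, padding makes the output slightly longer than these ratios suggest.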
  10. INTEGER REPRESENTATION
    ▸ What is the basic memory unit?
      ▸ Byte
    ▸ How does a computer store integers?
      ▸ Storing from the lowest byte to the highest byte is called little-endian (the reverse order is big-endian)
      ▸ Ex: int32_t n = 0xaabbccdd;
      ▸ In memory: dd cc bb aa
    ▸ Let's try!
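The slide's example can be tried without a C compiler. This sketch reproduces the byte order with `int.to_bytes` and `struct`, treating the value as an unsigned 32-bit integer.

```python
import struct
import sys

n = 0xaabbccdd

# Little-endian: lowest byte first. Big-endian: highest byte first.
little = n.to_bytes(4, 'little')
big = n.to_bytes(4, 'big')
print('little:', little.hex(' ') if sys.version_info >= (3, 8) else little.hex())
print('big:   ', big.hex())

# struct spells the same choice as '<' (little) or '>' (big).
packed = struct.pack('<I', n)
print('struct agrees:', packed == little)

# Reading the bytes back with the matching order recovers the value.
value = int.from_bytes(little, 'little')
print(hex(value))

# Byte order of the machine running this script.
print('native byte order:', sys.byteorder)
```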
  11. MEMORY MODEL
    ▸ Address (integer) and data (byte cells)
    ▸ The minimal addressable unit is the byte
    ▸ Use HxD / Cheat Engine / gdb to inspect process memory
    ▸ Use HxD / xxd / hexdump to inspect binary files
    ▸ Let's try!
    ▸ Pointer: actually just an integer; it stores an address as data
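When none of those tools is at hand, the address-plus-byte-cells view can be approximated in a few lines of Python. The `hexdump` helper below is a hypothetical name for this sketch, loosely modelled on the usual offset / hex / text layout, not a reimplementation of xxd or hexdump.

```python
def hexdump(data, width=16):
    """Format bytes as lines of: offset, hex byte cells, printable text."""
    pad = width * 3 - 1  # room for 'xx ' per byte, minus the last space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = ' '.join('{:02x}'.format(b) for b in chunk)
        # Show printable ASCII as-is, everything else as a dot.
        text_part = ''.join(chr(b) if 32 <= b < 127 else '.' for b in chunk)
        lines.append('{:08x}  {}  |{}|'.format(
            offset, hex_part.ljust(pad), text_part))
    return '\n'.join(lines)


print(hexdump(b'Hello, Hacker\x00\x01\x02\xff' + bytes(4)))
```

The left column plays the role of the address and each two-digit group is one byte cell.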
  12. DATA STRUCTURE
    ▸ A classic file structure contains:
      ▸ Magic Signature
      ▸ Header
      ▸ Size
      ▸ Table
      ▸ Offset (Address)
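PNG, mentioned earlier as a binary format, shows several of these parts: an 8-byte magic signature followed by an IHDR header chunk with a big-endian size field. The sketch below parses only those first bytes. `parse_png_header` is a hypothetical helper, the sample is built in memory rather than read from a real file, and the chunk CRC is ignored.

```python
import struct

PNG_MAGIC = b'\x89PNG\r\n\x1a\n'  # 8-byte PNG signature


def parse_png_header(blob):
    """Return basic image info from the start of a PNG byte string."""
    if blob[:8] != PNG_MAGIC:
        raise ValueError('magic signature does not match PNG')
    # Each chunk starts with a 4-byte big-endian size and a 4-byte type.
    size, chunk_type = struct.unpack('>I4s', blob[8:16])
    if chunk_type != b'IHDR' or size != 13:
        raise ValueError('first chunk is not a 13-byte IHDR')
    width, height, bit_depth, color_type = struct.unpack('>IIBB', blob[16:26])
    return {'width': width, 'height': height,
            'bit_depth': bit_depth, 'color_type': color_type}


# Synthetic header for illustration: 640x480, 8 bits, color type 2 (RGB).
sample = PNG_MAGIC + struct.pack('>I4sIIBBBBB',
                                 13, b'IHDR', 640, 480, 8, 2, 0, 0, 0)
info = parse_png_header(sample)
print(info)
```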
  13. PRACTICE IN PYTHON - BASIC TYPES
    ▸ 3 basic types: str, bytes, bytearray
      ▸ str: 'This is str'
      ▸ bytes: b'rawbytedata'
      ▸ bytearray: bytearray(b'converting')
    ▸ str to bytes: str.encode("encoding")
    ▸ bytes to str: bytes.decode("encoding")
  14. PRACTICE IN PYTHON - BASIC TYPES
    s = 'Hello, Hacker'
    b = s.encode('ascii')
    a = bytearray(b)
    print(type(s), type(b), type(a))
  15. PRACTICE IN PYTHON - FILE OPERATION
    # generate a binary file
    with open('out.dat', 'wb') as fout:
        fout.write(bytes(range(256)))
  16. PRACTICE IN PYTHON - FILE OPERATION
    # read a binary file
    with open('out.dat', 'rb') as fin:
        data = fin.read()
    print(data.hex())  # Py3.5
    # Py3.4
    # import binascii
    # print(binascii.hexlify(data))
    # Py2.7
    # print(data.encode('hex'))
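  17. PRACTICE IN PYTHON - XOR A FILE
    # read the file contents; bytearray() needs bytes, not a file object
    with open('data.bin', 'rb') as fin:
        content = bytearray(fin.read())
    # XOR every byte with the single-byte key 0x9c
    for i in range(len(content)):
        content[i] ^= 0x9c
    with open('out.bin', 'wb') as fout:
        fout.write(content)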
  18. PRACTICE IN PYTHON - BASE_XX ENCODING
    import base64
    data = bytes.fromhex('61626364')
    print(base64.b85encode(data))
    print(base64.b64encode(data))
    print(base64.b32encode(data))
    print(data.hex())
    # also b**decode(b'encoded data')
  19. PRACTICE IN PYTHON - XOR ENCRYPT AND BASE64 ENCODE
    import base64
    data = input('Data to be encrypted:')
    data = bytearray(data.encode('utf-8'))
    key = input('Key:').encode('utf-8')
    len_key = len(key)
    # XOR each byte with the key, repeating the key as needed
    for i in range(len(data)):
        data[i] ^= key[i % len_key]
    data = base64.b64encode(data)
    print(data.decode('ascii'))
  20. PRACTICE IN PYTHON - BASE64 DECODE AND XOR DECRYPT
    import base64
    data = input('Data to be decrypted:')
    data = data.encode('ascii')
    data = bytearray(base64.b64decode(data))
    key = input('Key:').encode('utf-8')
    len_key = len(key)
    # XOR with the same repeating key restores the original bytes
    for i in range(len(data)):
        data[i] ^= key[i % len_key]
    data = bytes(data).decode('utf-8')
    print(data)
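The encrypt and decrypt scripts on the last two slides are mirror images of each other, because XOR with the same key undoes itself. A possible way to check that without typing input by hand is to wrap the same steps in functions; the names `xor_bytes`, `encrypt` and `decrypt` are hypothetical, introduced only for this sketch.

```python
import base64


def xor_bytes(data, key):
    """XOR data with a repeating key; both arguments are bytes."""
    if not key:
        raise ValueError('key must not be empty')
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def encrypt(text, key):
    """UTF-8 encode, XOR with the key, then base64 encode to ASCII text."""
    mixed = xor_bytes(text.encode('utf-8'), key.encode('utf-8'))
    return base64.b64encode(mixed).decode('ascii')


def decrypt(token, key):
    """Reverse of encrypt: base64 decode, XOR with the key, UTF-8 decode."""
    mixed = base64.b64decode(token.encode('ascii'))
    return xor_bytes(mixed, key.encode('utf-8')).decode('utf-8')


token = encrypt('Hello, Hacker', 'key')
print(token)
print(decrypt(token, 'key'))
```

Decrypting with a wrong key either raises `UnicodeDecodeError` or prints garbage, since the XOR result is no longer guaranteed to be valid UTF-8.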
  21. XOR TOOL
    inndy@inndy-mac ~$ pip2 install xortool
    Collecting xortool
    Installing collected packages: xortool
    Successfully installed xortool-0.95
    inndy@inndy-mac ~$ xortool data -c ' '
    The most probable key lengths:
    1: 33.0%
    19: 13.5%
    21: 9.8%
    23: 9.0%
    25: 7.9%
    28: 6.9%
    32: 6.3%
    36: 4.8%
    38: 5.3%
    40: 3.7%
    Key-length can be 4*n
    1 possible key(s) of length 1:
    \xa5
    Found 1 plaintexts with 95.0%+ printable characters
    See files filename-key.csv, filename-char_used-perc_printable.csv
    inndy@inndy-mac ~$ xortool data -c ' ' -l 1
    1 possible key(s) of length 1:
    \xa5
    Found 1 plaintexts with 95.0%+ printable characters
    See files filename-key.csv, filename-char_used-perc_printable.csv
    inndy@inndy-mac ~$ cat xortool_out/0.out
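The `-c ' '` option tells xortool which character is expected to be the most frequent one in the plaintext. For a key of length 1, the idea behind such a guess can be sketched as follows. This is an illustration of the frequency trick under that assumption, not xortool's actual implementation; the function names and the sample plaintext are made up, and the key 0xa5 is borrowed from the session above.

```python
from collections import Counter


def xor_single(data, key):
    """XOR every byte of data with one key byte."""
    return bytes(b ^ key for b in data)


def guess_single_byte_key(ciphertext, most_frequent_char=' '):
    """Assume the most common ciphertext byte hides most_frequent_char."""
    most_common_byte, _count = Counter(ciphertext).most_common(1)[0]
    return most_common_byte ^ ord(most_frequent_char)


# Sample English-like plaintext in which the space is the most common byte.
plaintext = b'the quick brown fox jumps over the lazy dog and keeps on running'
ciphertext = xor_single(plaintext, 0xa5)

key = guess_single_byte_key(ciphertext)
print('guessed key:', hex(key))
print(xor_single(ciphertext, key).decode('ascii'))
```

The guess only works when the assumed character really is the most frequent one, which is why the option has to be chosen to match the kind of data being attacked.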