Slide 1

Slide 1 text

hexii: [ 'MZ', 0x90, 0x00, 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, "?".repeat(0x3c - 0x19), { offset: 0x3c }, 0xF8, 0, 0, 0, { offset: 0xf8 }, 'PE\0\0', ], descriptions: [ [0, 'Dos header'], [2, 'Magic', 'MZ'], { offset: 0x3C }, [4, 'Pointer to PE', 0xF8], { offset: 0xF8 }, [0, 'PE Header'], [4, 'PE signature', 'PE\\0\\0'], ], highlights: [ [0x19, 0x23], ],

Slide 2

Slide 2 text

A little bit of information visualisation in information security, A.K.A. "Simple Binary Description". Ange Albertini. Hack.lu & cti-summit, October 2023.

Slide 3

Slide 3 text

About the author
- Reverse engineering and hex addict since the 80s.
- Author of Corkami since 2007: file formats (polyglots, collisions), visualisations…
- PoC or GTFO since 2013.
- Malware analyst since 2005: Symantec, Avira, Google.
- In the Flare team since September 2023.
My own views and opinions.

Slide 4

Slide 4 text

I’m not an InfoVis professional. No formal training. Insert the “I have no idea what im doing” meme here. No dogma, no sacred rule. Disclaimer 4

Slide 5

Slide 5 text

My InfoSec ∩ InfoVis https://github.com/corkami/pics 5

Slide 6

Slide 6 text

Back in 2012 I made: PE 101 - A portable executable walkthrough Then various localizations: Visite guidée d'un exécutable Windows Ein Überblick über Windows Executables un recorrido por los ejecutables de windows تجول خلال ملف ويندوز تنفيذي Windows実行可能形式 윈도우 실행 정보 Plik PE krok po kroku пошаговое руководство к исполняемым файлам Windows可执行文件详解

Slide 7

Slide 7 text

A single file <-> useful to many people. Initial work: 1 month of hobby time, by hand, with Inkscape. It only covers a single executable file, but it helped many people learn. -> The lower the entry level, the more beneficial. Pixel art by Squiblydoo (2023)

Slide 8

Slide 8 text

Hard to update: it needs to be updated manually for every little change.

Slide 9

Slide 9 text

Hard to evolve Arabic text fields mixed with latin text 9

Slide 10

Slide 10 text

The real problems It doesn't scale with file size. Many structures are skipped to fit in the picture. Also: Is there a real need? 10

Slide 11

Slide 11 text

Infosec hates…
Graphical stuff:
- Opening a graphical tool
- Choosing a font
- Having to use a different theme: your color theme is your god.
Risks:
- Installing a JavaScript framework (security risk)
- Network connections (locked down computers)
- Privacy, exfiltration, malware escape…
(Image: the official description of the Dracula theme.)

Slide 12

Slide 12 text

Infovis in general (IMHO) Shiny one-shots. Rarely re-usable. …and bloated frameworks. Charles Joseph Minard's 1869 graphic of Napoleonic France's invasion of Russia 12

Slide 13

Slide 13 text

Reusable visualisations Charts, mind maps, sparklines, diagrams 13

Slide 14

Slide 14 text

From ‘101js’ to ‘sbud’ “Third Fourth time is the charm”. Automating PE 101 14

Slide 15

Slide 15 text

SBuD evolution: a trail of failed attempts. v1 (2014), v2 (2019), v3 (2023). https://twitter.com/angealbertini/status/517031673574477824 https://speakerdeck.com/ange/no-more-dumb-hex

Slide 16

Slide 16 text

SBuD v1 (2014-2016): using SVG.js. Very primitive. SVG.js is not low-level enough?

Slide 17

Slide 17 text

SBuD v2 (2019), w/ Rafał Hirsch: using a constraint-solving layout. -> A nightmare to debug. Way too high-level!

Slide 18

Slide 18 text

SBuD v3 (2023) Rewritten from scratch. A visual playground. 18

Slide 19

Slide 19 text

It looks good! More importantly: no more dead pixels/data! Is it useful ? 19

Slide 20

Slide 20 text

SBuD v3 "Hey, it's actually fun!" "Let's add new formats!" . . . 120 dissections later… 1ba 7-Zip 8SVX a LZMA a LZMA (w/ EOS) ActiveMime Aiff Aiff-c Amiga Hunk APE archive Arj BMP (v1) BMP (v3) BMP (v5) Cab Chm Clangd Index Compound File Binary Compress (.Z) Cpio (ASCII) Cpio (binary) Dalvik Dicom Dolphin Dolphin header EBCDIC ELF Emf emf (mini) Excel 1.0 (Biff 1) Excel 2.0 (Biff2) Excel 97 (Biff8 stream) Exe (Dos Stub) Exe (IBM PC Dos 1.0 LINK.EXE) Exif (jpeg) Exif (png) Exif (tiff) Fat Mach-O Gemdos program format (TOS) Gif (old) Gif (v87) Gif (v89) Guid Partition Table Gzip gzip gzip Ico (Bmp) Ico (Png) ID3 v2.3.0 Intel Hex Java Class JPEG (App1) JPEG (JFIF) Jxl Jxl (naked) KWAJ (compress) Linear Bitmap Linear Executable LZ4 lzip lzip (multiple members) mach-O Mach-O (PPC 64b) Mach-O (PPC) mach-O 64 Master Boot Record Master File Table Matroska Video (EBML) Midi Mp4 (ISO BMFF) New Executable nro One OS/360 Off Pcx (Ega16) Pcx (Vga256) PE (compiled) PE (mini 64b) PE (mini) Photoshop Photoshop (mini) Photoshop (w/ IPTC) Png Portable Image File Preferred Executable Format Program Information File Quite ok image format Rar v1.4 Rar v4 Rar v5 Redhat Packed Manager Resource fork riff Riff-based Midi Riff-based Midi (test) Rtf Shell Link Small web format Symbolic link Tar Terse Executable Tga TIFF (big-endian) Tiff (image data after metadata) TIFF (little-endian) USB Flashing Format Volume Boot Record WAD (Doom) WAD (mini) wasm Wav wmf (header-less) wmf (mini) wmf (MS) Xz Y4m (grayscale) Y4m (YUV) Zip Zip (2 files) Zip (Multi-vol) Zstandard 20

Slide 21

Slide 21 text

A small gallery… (more at corkami/pics) OS/360 (EBCDIC, 1965) Midi Nintendo Switch (2017) QOI (2022) 21

Slide 22

Slide 22 text

SBuD… proved it's doable and saves a lot of time. It makes describing a file fun! From hours to minutes of work. Ended up doing a lot w/ it. Explored formats and features instead of picking colors. Connectable to parsers.

Slide 23

Slide 23 text

SBuD self-imposed restrictions Local-only, no framework, no dependencies: + ready to use, works offline - no module SVG is generated from scratch with vanilla JS. Very lightweight SVG - w/ Inkscape extras. 23

Slide 24

Slide 24 text

Lessons learned. File formats come in 3 densities of information: bit-based (with various directions), binary (nibble values are important), and text. Adapting to fonts and themes is a fundamental requirement. SBuD needs yet another rewrite, but this time the data and experience can be reused.

Slide 25

Slide 25 text

3 densities of information: Bits, Bytes, Chars

Slide 26

Slide 26 text

✅ A single good looking style SBuD v3 26

Slide 27

Slide 27 text

No Sbud v3 release ? Too overengineered. Hardcoded values, fonts… No immediate need. Design of the “hex pills” -> 27

Slide 28

Slide 28 text

Everybody needs their custom color theme and coding font Typical feedback: - “Where’s the dark mode?” - “Shadows are too blurry.” - “You have no clue for design, don’t you?” Problem 28

Slide 29

Slide 29 text

What’s really needed ? What do we need, and can we “simply” solve it ? The true problem 29

Slide 30

Slide 30 text

A simple need with an awful result: “Hex viewer with descriptions and arrows” DOS Signature Offset to PE Header PE signature 30

Slide 31

Slide 31 text

Painful to make… - Screenshot, cut borders, arrows… - No consistency: different tools, different systems, different color themes. - JPEG artefacts. - Questionable color theme. 31

Slide 32

Slide 32 text

…painful to use…
- Too many unneeded garbage bytes.
- Who cares about the name of your hex editor?
- Hard to match ASCII and hex.
- Offsets and sizes of structures aren't visible.

Slide 33

Slide 33 text

…painful to re-use! - Find and open the same file or retype the hex content? - Localization ? Alternate layout ? - Dead data. Unparsable. Ungeneratable. Yet doesn’t feel like rocket science… 33

Slide 34

Slide 34 text

It’s like sharing screenshots of samples’ hashes !!! With an unreadable font… 34

Slide 35

Slide 35 text

Demo! And now… 35

Slide 36

Slide 36 text

SBuD [v4] 1. Type data 2. Instant visualisation 3. Download [SVG, PNG, PDF] 36 https://corkami.github.io/sbud/hexii.html

Slide 37

Slide 37 text

hexii: [ 'MZ', 0x90, 0x00, 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, "?".repeat(0x3c - 0x19), { offset: 0x3c }, 0xF8, 0, 0, 0, { offset: 0xf8 }, 'PE\0\0', ], desc: [ [2, 'Magic', 'MZ'], [4, 'Pointer to PE', 0xF8, { offset: 0x3C }], [4, 'PE signature', 'PE\\0\\0', { offset: 0xF8 }], ], highlights: [ [0x19, 0x23], ], From data to visualisation. 37
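To make the `desc` input above more concrete, here is a minimal sketch of how such entries could be resolved into absolute byte ranges. It assumes each entry is `[size, label, value, { offset }]` and that an entry without an explicit offset directly follows the previous one. The name `resolveDescriptions` is hypothetical and is not SBuD's actual API.

```javascript
// Hypothetical helper (not part of SBuD): turn description entries into ranges.
function resolveDescriptions(desc) {
  let cursor = 0; // running offset, used when an entry has no explicit offset
  return desc.map(([size, label, value, opts]) => {
    const start = opts && opts.offset !== undefined ? opts.offset : cursor;
    cursor = start + size;
    return { start, end: start + size, label, value };
  });
}

const ranges = resolveDescriptions([
  [2, 'Magic', 'MZ'],
  [4, 'Pointer to PE', 0xF8, { offset: 0x3C }],
  [4, 'PE signature', 'PE\\0\\0', { offset: 0xF8 }],
]);

// Print each range as "offset+size label", with the offset in hexadecimal.
for (const r of ranges) {
  console.log(`${r.start.toString(16)}+${r.end - r.start} ${r.label}`);
}
```

Keeping the offset optional means a run of consecutive fields only needs one explicit offset, which keeps the typed input short.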

Slide 38

Slide 38 text

JavaScript code (optional, but easier to type than pure JSON) …generates… JSON (actual input) …generates… the rendering.

Slide 39

Slide 39 text


Slide 40

Slide 40 text

It's not just a picture
+ Vector image:
  + One line = 2 nodes.
  + Infinite zoom at any scale: no blur, no pixels.
  + Vector bitmap font.
  + Text remains selectable.
  + Embeddable images.
  - No embedded font.
+ XML:
  + typed manually or generated automatically.
  + can be pre/post processed.

Slide 41

Slide 41 text

SVG is here to stay! It's a standard vector format.
- Supported by browsers, viewers, editors…
- Can be converted to PDF (no CSS, no filters): just “Print, Save as PDF” -> font glyphs are embedded in the document.
- Can be rendered as PNG of any (fixed) dimension (no more resizing!).
- Can be converted to EMF (Google Docs vector format).

Slide 42

Slide 42 text

Post-process an SVG?
- In the browser, w/ JavaScript
- XML manipulation
- Inkscape or any other software…
My workshop slides on Inkscape ->

Slide 43

Slide 43 text

Your input Just the important data. - binary contents as ‘compact hexii’. - structures description. - optional highlights (‘where my signature hits’). Not required to document every byte. -> don't flood your audience with secondary content. (unlike hex viewer screenshots) 43

Slide 44

Slide 44 text

Compact HexII: text mixed with byte values as integers (ASCII only).
Example of input JS code:
'MZ', 0x90, 0x00, 0, "?".repeat(0x3c - 0x19), { offset: 0xf8 }, 'PE\0\0',
JSON conversion:
"MZ",144,0,0,"???????????????????????????????????", {"offset":248},"PE\u0000\u0000"
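As an illustration of the compact HexII idea, the sketch below expands such a list into a sparse map from offset to byte value. It assumes strings are ASCII text (one byte per character, so `?` is stored as a literal 0x3F), numbers are single bytes, and `{ offset }` objects move the write position. The name `expandHexii` is hypothetical, and SBuD's own handling may differ.

```javascript
// Hypothetical helper (not SBuD's API): expand compact HexII into offset -> byte.
function expandHexii(items) {
  const bytes = new Map(); // sparse: offsets that were skipped are simply absent
  let offset = 0;
  for (const item of items) {
    if (typeof item === 'number') {
      bytes.set(offset++, item & 0xff);
    } else if (typeof item === 'string') {
      // ASCII only: each character becomes one byte.
      for (const ch of item) bytes.set(offset++, ch.charCodeAt(0) & 0xff);
    } else if (item && typeof item.offset === 'number') {
      offset = item.offset; // jump; the bytes in between are not described
    }
  }
  return bytes;
}

const bytes = expandHexii(['MZ', 0x90, 0x00, '?'.repeat(2), { offset: 0xf8 }, 'PE\0\0']);
console.log(bytes.get(0), bytes.get(0xf8), bytes.size); // 77 80 10
```

A sparse map fits the "just the important data" approach: nothing forces the input to describe every byte of the file.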

Slide 45

Slide 45 text

Create SVGs without any dependency?
const xmlns = 'http://www.w3.org/2000/svg';
const svgEl = document.createElementNS(xmlns, 'svg');
const divEl = document.getElementById('svgdiv');
divEl.appendChild(svgEl);
// A 'text' element (not 'rect') is needed to render a character.
const text = document.createElementNS(xmlns, 'text');
text.textContent = 'E';
text.setAttribute('x', `50`);
text.setAttribute('y', `80`);
text.setAttribute('fill', "#FF0000");
text.setAttribute('font-size', `80px`);
text.setAttribute('font-family', 'serif');
text.setAttribute('text-anchor', 'middle');
svgEl.appendChild(text);
Result: a red serif “E”. Minimal XML ->
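Outside a browser there is no `document`, but the same minimal XML can be produced as a plain string. The following is only a sketch of that alternative: the helper names `escapeXml` and `el` are hypothetical, and the `width` and `height` values are assumptions that do not come from the slide.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: serialise one element; `content` must already be valid XML.
function el(name, attrs, content = '') {
  const attrText = Object.entries(attrs)
    .map(([key, value]) => ` ${key}="${escapeXml(value)}"`)
    .join('');
  return `<${name}${attrText}>${content}</${name}>`;
}

const svg = el(
  'svg',
  { xmlns: 'http://www.w3.org/2000/svg', width: 100, height: 100 }, // assumed size
  el('text', {
    x: 50, y: 80, fill: '#FF0000',
    'font-size': '80px', 'font-family': 'serif', 'text-anchor': 'middle',
  }, escapeXml('E'))
);
console.log(svg);
```

Saved with an `.svg` extension, the printed string should open in a browser or in Inkscape.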

Slide 46

Slide 46 text

SBuD v4: ✅ fits a need ✅ adapts to different styles. XKCD 1205: is it worth the time?

Slide 47

Slide 47 text

"But JavaScript / browsers s*ck!" The lowest entry environment to provide something graphical without any required setup. No required setup, no excuse. Alternative solutions were tried: manual SVG generation, CairoSVG… Browsers are a nice compromise: open the webpage, type, save. …and at least, it's ts -check JavaScript, with types :) 47

Slide 48

Slide 48 text

"But JSON s*cks !!" Indeed! (no hexadecimal numbers !) But it’s the lowest common denominator (unlike JSON5…) “Anything” can generate JSON. Native Javascript can also be directly used. SyntaxError: Expected ',' or ']' after array element in JSON at position... 48

Slide 49

Slide 49 text

.JSON files ⊊ JS object notation: no hex, no template literals.
From JavaScript object to JSON:
> o = {hex:[`${3*2}`, 0x20]}
> o.hex
<- (2) ['6', 32]
> typeof(o)
<- 'object'
> JSON.stringify(o)
<- '{"hex":["6",32]}'
Properties as strings only:
> JSON.parse('{hex:[]}')
<- Uncaught SyntaxError: Expected property name o…
> JSON.parse('{"hex":[]}')
<- {hex: Array(0)}
No hex:
> JSON.parse('{"hex":[0x90]}')
<- Uncaught SyntaxError: Expected ',' or ']' afte…
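The same observations can be checked in a small script rather than in the console. This is only a sketch: the helper `isValidJson` is a hypothetical name, while `JSON.stringify` and `JSON.parse` are the standard built-ins.

```javascript
// A JavaScript object literal can use hex numbers and template literals...
const o = { hex: [`${3 * 2}`, 0x20] };

// ...but both are evaluated before serialisation, so the JSON contains neither.
const json = JSON.stringify(o);
console.log(json); // {"hex":["6",32]}

// Hypothetical helper: report whether a string parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidJson('{hex:[]}'));       // false: property names must be quoted
console.log(isValidJson('{"hex":[]}'));     // true
console.log(isValidJson('{"hex":[0x90]}')); // false: no hexadecimal numbers
```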

Slide 50

Slide 50 text

From parsers to SBuD. It's just JSON. It can be easily generated from any dissector without any dedicated library.
SBuD output hack for FQ:
# Usage:
#   fq -L . 'include "to_sbud"; to_sbud' format/gif/testdata/4x4.gif | pbcopy
def to_sbud:
  ( [.. | to_entries?[]] as $entries
  | { hexii:
        [ $entries[].value
        | scalars
        | tobytes?
        | explode[]
        | if . >= 33 and . <= 126 then [.] | implode end
        ]
    , descriptions:
        [ $entries[] as {$key, $value}
        | $value
        | scalars
        | [ tobytesrange.size
          , ($key | tostring)
          , if type != "string" then tojson end
          ]?
        ]
    }
  );
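The same idea can be sketched in JavaScript for a dissector that already yields a flat list of fields. The field shape `{ name, bytes, value }` and the function name `toSbud` are assumptions made for this illustration. The sketch only loosely mirrors the fq hack: printable ASCII becomes characters, other bytes stay numbers, and each field contributes one `[size, name, value]` description.

```javascript
// Hypothetical field shape: { name, bytes: number[], value }, in file order.
function toSbud(fields) {
  const hexii = [];
  const descriptions = [];
  for (const { name, bytes, value } of fields) {
    for (const b of bytes) {
      // Printable ASCII becomes a character, everything else stays a number.
      hexii.push(b >= 33 && b <= 126 ? String.fromCharCode(b) : b);
    }
    descriptions.push([bytes.length, name, value]);
  }
  return { hexii, descriptions };
}

const input = toSbud([
  { name: 'magic', bytes: [0x47, 0x49, 0x46], value: 'GIF' },
  { name: 'version', bytes: [0x38, 0x39, 0x61], value: '89a' },
  { name: 'width', bytes: [0x04, 0x00], value: 4 },
]);
console.log(JSON.stringify(input));
```

Since the result is plain JSON, it can be pasted into the tool's input without any dedicated library.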

Slide 51

Slide 51 text

SBuD status: still experimental. No tests, no fuzzing. Still very early. Everything might evolve. But already useful! And a nice, reusable visualisation is addictive!

Slide 52

Slide 52 text

Problem: which fonts are present?
- Enumerating local fonts is a privacy risk.
- Only supported by Chrome and Safari.
-> Render a text with a given font family, then check whether the dimensions changed.
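A rough sketch of the "render and compare dimensions" trick follows. It assumes that a missing font makes the browser fall back to the generic family listed after it, so the measured width equals that of the fallback alone. The names `isFontAvailable` and `canvasMeasure` are hypothetical. The measuring function is passed in as a parameter so that the comparison logic does not depend on a DOM.

```javascript
// `measure(text, fontFamily)` must return the rendered width of `text`.
function isFontAvailable(family, measure, sample = 'mmmmmmmmmmlli') {
  // Try several generic fallbacks: a font whose metrics happen to match one
  // fallback is unlikely to match all three.
  return ['monospace', 'serif', 'sans-serif'].some(
    (fallback) =>
      measure(sample, `"${family}", ${fallback}`) !== measure(sample, fallback)
  );
}

// Browser-only measuring function (assumes a DOM with canvas support).
function canvasMeasure(text, fontFamily) {
  const ctx = document.createElement('canvas').getContext('2d');
  ctx.font = `72px ${fontFamily}`;
  return ctx.measureText(text).width;
}

// Usage in a browser: isFontAvailable('Fira Code', canvasMeasure)
```

This can still give a false negative when a font has exactly the same widths as every fallback for the sample text, so it is a heuristic, not a guarantee.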

Slide 53

Slide 53 text

What's next (Actual product(s) may differ) 53

Slide 54

Slide 54 text

Source Text w/ a simple keyword colorizer w/ wrapping! Line numbers, flow. Extra arrows and annotations. Use cases: - describeBlock("", "", "HTML body"); - arrow(endLine(line("XREF" + 1), "1 0 obj"); 54

Slide 55

Slide 55 text

What else…? 1/2 55

Slide 56

Slide 56 text

What else…? 2/2 56

Slide 57

Slide 57 text

Conclusion 57

Slide 58

Slide 58 text

DOS Signature Offset to PE Header PE signature No more ugly screenshots! 58

Slide 59

Slide 59 text

hexii: [ 'MZ', 0x90, 0x00, 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, "?".repeat(0x3c - 0x19), { offset: 0x3c }, 0xF8, 0, 0, 0, { offset: 0xf8 }, 'PE\0\0', ], descriptions: [ [0, 'Dos header'], [2, 'Magic', 'MZ'], { offset: 0x3C }, [4, 'Pointer to PE', 0xF8], { offset: 0xF8 }, [0, 'PE Header'], [4, 'PE signature', 'PE\\0\\0'], ], highlights: [ [0x19, 0x23], ], Minimum information for a nice rendering 59

Slide 60

Slide 60 text

SBuD 60 https://corkami.github.io/sbud/hexii.html Hex viewing. Type, don't screenshot. Bring your own theme!

Slide 61

Slide 61 text

It's not about shiny pixels. It's about information. Clear information generation w/ your theme, branding. Dynamic information: generate, update, reuse.

Slide 62

Slide 62 text

- Print [to PDF] with no blank space. - Reliable font metrics in the browser. - modules <-> server-less use. - vanilla JS fuzzing + testing. 62 Open challenges

Slide 63

Slide 63 text

Thank you! Any feedback is welcome! Special thanks to: Philippe Teuwen, Rafał Hirsch, Mattias Wadman (Jq/Fq), WerWolv (ImHex).

Slide 64

Slide 64 text

Wheel of fortune / crosswords Introduce important acronyms (add their initials) 64 To be continued 1/3

Slide 65

Slide 65 text

Diagrams I wish I could have automated: Defeating the E7 protection (PoCorGTFO 11:05), Pokemon plays twitch (PoCorGTFO 10:03), annotated disassembly, decorated grid. To be continued 2/3

Slide 66

Slide 66 text

Reverse engineering Star Raiders PoCorGTFO 13:2 (2016) 66 To be continued 3/3

Slide 67

Slide 67 text

Worth checking: Cantor Dust, Veles, 010 Editor, Synalyze, Poke, FQ, Hiew, Hobbits, Pixd.
Recommended: Kaitai, ImHex and FQ.
Useful invocations:
- fq ddv -M -o line_bytes=16
- fq "tovalue | walk((scalars | {name: ._name, range: (tobytesrange | [.start, .stop]), value: .}) //.)" -M

Slide 68

Slide 68 text

Size and alignments? It could have been easy: 00+2 Dos signature Another example (same font size, same line): 00+2 Dos signature Hurdles 1/4

Slide 69

Slide 69 text

Same Font size, same character MMMMMMMMMMMMM MMMMMMM MMMMMMMMMMMMMMM MMMMMMMMMM MMMMMMMMMMMM MM 69 Hurdles 2/4

Slide 70

Slide 70 text

Problem: alignment - JavaScript can’t get all fonts metrics: (Only the graphical dimensions) Sometimes, it's just buggy. Some fonts are unusable (why?) -> Extract and store metrics via an external script? 70 Hurdles 3/4

Slide 71

Slide 71 text

Same characters, same font size And yet a different height, different line space… 71 Hurdles 4/4

Slide 72

Slide 72 text

Variety in color themes
- Light/dark mode
- Palettes: sequential, qualitative, with highlights & shadows, monochrome, grayscale, b&w.
- Some themes directly define programming use cases. -> A custom mapping is necessary.
- Use cases: e-readers, color blindness.
Themes 1/2

Slide 73

Slide 73 text

Themes 73 Qualitative Code-oriented Accents Monotone Themes 2/2

Slide 74

Slide 74 text

What about a custom syntax like Mermaid ? erDiagram CUSTOMER }|..|{ DELIVERY-ADDRESS : has CUSTOMER ||--o{ ORDER : places CUSTOMER ||--o{ INVOICE : "liable for" DELIVERY-ADDRESS ||--o{ ORDER : receives INVOICE ||--|{ ORDER : covers ORDER ||--|{ ORDER-ITEM : includes PRODUCT-CATEGORY ||--|{ PRODUCT : contains PRODUCT ||--o{ ORDER-ITEM : "ordered in" Mermaid diagram syntax Generated Mermaid diagram 74 Other syntaxes 1/2

Slide 75

Slide 75 text

Mermaid-like syntax: your data will be stuck in its custom format. JSON can be easily [re-]generated, parsed… You can type your own JSON, or type JS and convert it to JSON. SVG is alive. JSON is alive. A custom syntax is OK, but it has to be converted to JSON anyway. Other syntaxes 2/2

Slide 76

Slide 76 text

Terminal output experiments Give text mode some love… 76 Terminal 1/6

Slide 77

Slide 77 text

w/ standard colors HexII (Cli tool) (2020) corkami/src/HexII 77 Terminal 2/6

Slide 78

Slide 78 text

w/ RGB colors and styles (bad compatibility) 78 Terminal 3/6

Slide 79

Slide 79 text

More visual experiments Text compact mode w/ alternate colors 79 Terminal 4/6

Slide 80

Slide 80 text

Experiments w/ Braille. Codepages are useless when it's not text. ASCII and braille never align properly! Terminal 5/6

Slide 81

Slide 81 text

Lessons learned Some funky experiments. Codepages are here to stay ? Not the best to determine patterns. Terminal forces a different view. 81 Terminal 6/6

Slide 82

Slide 82 text

Bugs & workarounds Inkscape doesn't use , but only . Whitespace preservation: - white-space: pre; in CSS for browser, - xml:space="preserve" for Inkscape. 82
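The whitespace workaround can be applied by emitting both hints on every text element. The sketch below builds such an element as a string. The function name `preservedText` is hypothetical, and placing `white-space: pre` in an inline `style` attribute (rather than in a stylesheet) is an assumption made for brevity.

```javascript
// Hypothetical helper: a <text> element that keeps its spaces in both
// browsers (CSS white-space) and Inkscape (xml:space).
function preservedText(x, y, content) {
  const escaped = content
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return `<text x="${x}" y="${y}" xml:space="preserve" style="white-space: pre;">${escaped}</text>`;
}

console.log(preservedText(0, 10, '00  4D 5A   MZ'));
```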

Slide 83

Slide 83 text

Fonts you might like Oldschool: - The Ultimate Oldschool PC Font Pack - Retro computing fonts by Kreative Software Handwritten: Patrick Hand, Aracne 83