Slide 1

Slide 1 text

Relations between archive formats
Ange Albertini, 08/04/2022
[Title graphic: Scrabble-style letter tiles]

Slide 2

Slide 2 text

A presentation by A.K.A. Ange Albertini

Slide 3

Slide 3 text

About the author
- reverse engineering since 1989
- author of Corkami
- File Formats For Ever at PoC or GTFO
- malware analysis
- infosec engineer
My license plate is a CPU, my phone case is a PDF doc, my PDF resume is a SNES/MD ROM.
My own views and opinions.

Slide 4

Slide 4 text

TL;DR
There is a lot of confusion regarding Zlib/Gzip/Zip/Deflate. Is Deflate “Zip’s algorithm”?
This deck is not about explaining compression algorithms.
A Corkami original production: the current slide is an honest talk trailer.
[Image text: "zlib - Compression compatible with gzip"]

Slide 5

Slide 5 text

Standards timeline
- 1989-2020: Zip file format (AppNote)
- 1996/05: RFC 1950 - Zlib Compressed Data Format Specification
- 1996/05: RFC 1951 - Deflate Compressed Data Format Specification
- 1996/05: RFC 1952 - Gzip file format
Zip is much older. All related RFCs were submitted together, which is confusing.

Slide 6

Slide 6 text

Zip timeline
Supported compressions (cf. AppNote archive), with method numbers where the slide shows them:
- 1990 v1.0: {Store (0), Shrunk (1), Reduce1/2/3/4 (2-5), Implode (6)}
- 1993 v2.0: +{Tokenize (7), Deflate (8)}
- 2001 v4.5.0: +{Deflate64, Imploding}
- 2003 v5.2.0: +{Res11, Bzip2}
- 2006 v6.3.0: +{Res13, LZMA, Res15-17, IBM Terse, Lz77, PPMd}
- 2020 v6.3.9: +{Zstd, Mp3, XZ, Jpeg, WavPack}
Cf. Hans Wennborg's blog post.
Deflate: CompressionMethod = 8

Slide 7

Slide 7 text

Zip supports a lot more than Deflate
Since 1992, Deflate is ZIP’s standard ‘generic’ compression.
Some tools only support Deflate (and reject other methods):
-> using an older compression method is an easy security bypass.
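A quick way to see which method each entry uses is to read the CompressionMethod field through Python's standard `zipfile` module. This is only a sketch: the archive, the file names and the helper `compression_methods` are made up for illustration.

```python
import io
import zipfile

# Build a small in-memory archive whose two entries use different methods
# (illustrative data, not taken from the slides).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("stored.txt", b"hello", compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", b"hello" * 100, compress_type=zipfile.ZIP_DEFLATED)


def compression_methods(zip_bytes: bytes) -> dict:
    """Map each entry name to its numeric CompressionMethod (0 = Store, 8 = Deflate)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: info.compress_type for info in zf.infolist()}


methods = compression_methods(buf.getvalue())
# Entries that a Deflate-only tool would not handle.
not_deflate = [name for name, method in methods.items() if method != zipfile.ZIP_DEFLATED]
print(methods, not_deflate)
```

A tool that wants to be strict could list `not_deflate` and decide explicitly what to do with those entries instead of silently skipping them.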

Slide 8

Slide 8 text

OK, we know that Deflate is one of Zip’s algorithms: the standard one.

Slide 9

Slide 9 text

Let’s not deep-dive into Deflate.
Let’s just pick one example.

Slide 10

Slide 10 text

The minimal Deflate stream
Deflate stream of an empty stream.
Compressed form: 03 00
- Last/Type: True/Fixed Huffman (0x03 = final-block bit set, block type 01)
- Length: 0
Raw form: 01 00 00 FF FF
- Last/Type: True/No Compression
- Length: 0
- !Length: -1 (0xFFFF)
Tiny, but already complex for empty data!
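The two streams above can be checked with Python's `zlib` module, where `wbits=-15` asks for a raw Deflate stream without any wrapper. A minimal sketch; the helper `describe_first_block` is a hypothetical name, and it only reads the first three header bits defined in RFC 1951 (BFINAL, then the 2-bit BTYPE).

```python
import zlib

BLOCK_TYPES = {0: "no compression", 1: "fixed Huffman", 2: "dynamic Huffman", 3: "reserved"}


def describe_first_block(stream: bytes):
    """Return (is_last_block, block_type_name) from the first byte of a Deflate stream."""
    first = stream[0]
    bfinal = first & 1          # bit 0: last block?
    btype = (first >> 1) & 3    # bits 1-2: block type
    return bool(bfinal), BLOCK_TYPES[btype]


compressed_form = bytes.fromhex("03 00")
raw_form = bytes.fromhex("01 00 00 FF FF")

for stream in (compressed_form, raw_form):
    # Both streams inflate to the same empty payload.
    print(stream.hex(), describe_first_block(stream), zlib.decompress(stream, wbits=-15))
```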

Slide 11

Slide 11 text

Zip Store method
The other standard ZIP method. “No Compression”.
Pure raw data - the original file as-is (useful to keep payloads still usable).
Zip Storing is not the same as Deflate with no compression:
- Zip-Stored empty string: “” (no bytes at all)
- Deflate-stored empty string: 01 00 00 FF FF (Last/Type = True/NC, Length = 0, !Length = 0xFFFF)
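The difference can be made visible in a few lines. This sketch assumes a payload of at most 65535 bytes (one stored block); `deflate_stored_block` is a hypothetical helper that hand-builds a single final "no compression" Deflate block, and the ZIP side uses the standard `zipfile` module.

```python
import io
import struct
import zipfile
import zlib


def deflate_stored_block(data: bytes) -> bytes:
    """One final 'no compression' Deflate block: header byte, Length, !Length, then the data."""
    n = len(data)
    return struct.pack("<BHH", 0x01, n, n ^ 0xFFFF) + data


payload = b"Hello World!\n"
block = deflate_stored_block(payload)

# ZIP Store: the payload is written as-is, with no Deflate block header around it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", payload)
archive = buf.getvalue()

print(len(payload), len(block), payload in archive, block in archive)
```

The Deflate-stored form costs 5 extra bytes per block; the ZIP-stored form costs none, which is why the payload stays directly usable inside the archive.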

Slide 12

Slide 12 text

What about Gzip and Zlib?

Slide 13

Slide 13 text

A minimal Zlib stream (simplified)
Offset: 00 01 02 03 04 05 06 07
Bytes:  78 DA 03 00 00 00 00 01
Layout: [4 bits] Method [1 byte] | Deflate data | [4 bytes]
Simplified contents:
- Some parameters, including the Compression Method
- Deflate data
- a footer
Always 2 bytes before, 4 bytes after.

Slide 14

Slide 14 text

A minimal Zlib stream
Offset: 00 01 02 03 04 05 06 07
Bytes:  78 DA 03 00 00 00 00 01
- Window Size: 7 = 32 KB
- Method: 8 = Deflate
- Flags: No Dictionary, Extra
- Checksum: 0x78DA % 31 = 0
- Deflate data (03 00): Last/Type = True/Fixed Huffman, Length = 0
- Adler32: 0x00000001
From RFC 1950:
"CM (Compression method) This identifies the compression method used in the file. CM = 8 denotes the "deflate" compression method with a window size up to 32K. This is the method used by gzip and PNG (see references [1] and [2] in Chapter 3, below, for the reference documents). CM = 15 is reserved. It might be used in a future version of this specification to indicate the presence of an extra field before the compressed data."
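The fields above can be read back with a few bit operations. A minimal sketch following the RFC 1950 layout, assuming no preset dictionary (so the Deflate data starts right after the 2 header bytes); the dictionary keys are illustrative names.

```python
import struct
import zlib

stream = bytes.fromhex("78 DA 03 00 00 00 00 01")

cmf, flg = stream[0], stream[1]
fields = {
    "method": cmf & 0x0F,                  # CM: 8 = Deflate
    "window_bits": (cmf >> 4) + 8,         # CINFO 7 -> 2**15 = 32 KB window
    "preset_dictionary": (flg >> 5) & 1,   # FDICT
    "level": flg >> 6,                     # FLEVEL
    "header_check_ok": (cmf * 256 + flg) % 31 == 0,
}

deflate_data = stream[2:-4]                    # 2 bytes before, 4 bytes after
(adler32,) = struct.unpack(">I", stream[-4:])  # the footer is big-endian

print(fields, deflate_data.hex(), hex(adler32), zlib.decompress(stream))
```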

Slide 15

Slide 15 text

A minimal Gzip archive
0x: 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x: 00 00 00 00
Layout: [2 bytes] Compression Method [variable] | Deflate data | [8 bytes]
- Magic: 1F 8B
- Compression Method: 8 = Deflate
Compression method is always 08 (Deflate).
From RFC 1952:
"CM (Compression Method) This identifies the compression method used in the file. CM = 0-7 are reserved. CM = 8 denotes the "deflate" compression method, which is the one customarily used by gzip and which is documented elsewhere."

Slide 16

Slide 16 text

In detail…
0x: 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x: 00 00 00 00
- Magic: 1F 8B
- Method: 8 = Deflate
- Flags: None
- ModTime: 0 (no timestamp)
- Extra Flags: Max compression
- OS: Unknown
- Deflate data (03 00): Last/Type = True/Fixed Huffman, Length = 0
- CRC32: 0x00000000
- lenUncomp: 0
Some fixed-length information is required before and after the Deflate data.
FileName, Comments, Extra Field are variable and optional (not used here).
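The same 20 bytes can be unpacked with `struct`. A minimal sketch, valid only because Flags is 0 here: with optional fields present, the Deflate data would not start at offset 10. Variable names are illustrative.

```python
import gzip
import struct
import zlib

member = bytes.fromhex("1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00 00 00 00 00")

# Fixed 10-byte header: magic, method, flags, modtime, extra flags, OS (little-endian).
magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", member[:10])

# Fixed 8-byte trailer: CRC32 and uncompressed length of the original data.
crc32, isize = struct.unpack("<II", member[-8:])

deflate_data = member[10:-8]  # everything in between, since no optional field is set

print(magic.hex(), method, flags, mtime, xfl, os_id, deflate_data.hex(), crc32, isize)
print(gzip.decompress(member))
```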

Slide 17

Slide 17 text

Zlib <-> Gzip
2 different ways to store a Deflate data stream.
Both with data before and after.
The compressed data can be transferred, but the two formats are not compatible with each other.
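Transferring the compressed data can be sketched as follows: keep the Deflate bytes untouched and only swap the wrapper. The helper `zlib_to_gzip` is hypothetical; it assumes a zlib stream without preset dictionary, and it has to inflate once because Gzip's footer needs a CRC32 where Zlib stores an Adler32.

```python
import gzip
import struct
import zlib


def zlib_to_gzip(zdata: bytes) -> bytes:
    """Re-wrap the Deflate data of a zlib stream as a Gzip member, without recompressing."""
    raw = zdata[2:-4]                          # drop 2 header bytes and the Adler32
    plain = zlib.decompress(raw, wbits=-15)    # needed only to compute the new footer
    # Magic, method 8, no flags, no timestamp, no extra flags, OS = unknown.
    header = bytes([0x1F, 0x8B, 0x08, 0x00]) + struct.pack("<IBB", 0, 0, 0xFF)
    trailer = struct.pack("<II", zlib.crc32(plain), len(plain) & 0xFFFFFFFF)
    return header + raw + trailer


zdata = zlib.compress(b"Hello World!")
gdata = zlib_to_gzip(zdata)
print(gzip.decompress(gdata))
```

Feeding `zdata` directly to a Gzip reader (or `gdata` to a Zlib reader) fails, which is the "not compatible" part; only the middle bytes are shared.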

Slide 18

Slide 18 text

Zlib data stream: 78 DA 03 00 00 00 00 01
Layout: [4 bits] Method (8 = Deflate) [1 byte] | Deflate data | [4 bytes]
GZip “member”: 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00 00 00 00 00
Layout: [2 bytes] Method (8 = Deflate) [variable] | Deflate data | [8 bytes]
Deflate data (common to both): 03 00

Slide 19

Slide 19 text

Zlib data stream: 78 DA 03 00 00 00 00 01
- Window Size: 7 = 32 KB
- Method: 8 = Deflate
- Flags: No Dictionary, Extra
- Checksum: 0x78DA % 31 = 0
- Deflate data (03 00): Last/Type = True/Fixed Huffman, Length = 0
- Adler32: 0x00000001
GZip “member”: 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00 00 00 00 00
- Magic: 1F 8B
- Method: 8 = Deflate
- Flags: None
- ModTime: 0 (no timestamp)
- Extra Flags: Max compression
- OS: Unknown
- Deflate data (03 00): Last/Type = True/Fixed Huffman, Length = 0
- CRC32: 0x00000000
- lenUncomp: 0
Deflate data (common to both): 03 00

Slide 20

Slide 20 text

A complete ZIP archive with empty deflated data
(Structures are numbered in parsing order and listed in file order. Offsets are in hex.)

3. Local File Header
Bytes: P K 03 04 0A 00 00 00 08 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 05 00 00 00 e m p t y 03 00
Offset  Size  Field            Value
00      4     Signature        PK\3\4
04      2     NeededVersion    10
06      2     Flags            None
08      2     CompMethod       8=Deflate
0A      2     ModTime          00:00
0C      2     ModDate          0/0/1980
0E      4     CRC32            0x00000000
12      4     CompressSize     2
16      4     UncompSize       0
1A      2     FileNameLen      5
1C      2     ExtraFieldLen    0
1E      ?     FileName         empty
23      ?     Content          03 00
25      ?     ExtraField       n/a

2. Central Directory
Bytes: P K 01 02 00 00 0A 00 00 00 08 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e m p t y
Offset  Size  Field            Value
25      4     Signature        PK\1\2
29      2     MadeVersion      0
2B      2     NeededVersion    10
2D      2     Flags            None
2F      2     CompMethod       8=Deflate
31      2     ModTime          00:00
33      2     ModDate          0/0/1980
35      4     CRC32            0x00000000
39      4     CompressSize     2
3D      4     UncompSize       0
41      2     FileNameLen      5
43      2     ExtraFieldLen    0
45      2     FileCommentLen   0
47      2     DiskNumberStart  0
49      2     InternalAttr     0
4B      4     ExternalAttr     0
4F      4     LFHOffset        0
53      ?     FileName         empty
58      ?     ExtraField       n/a
58      ?     FileComment      n/a

1. End of Central Directory
Bytes: P K 05 06 00 00 00 00 00 00 01 00 33 00 00 00 25 00 00 00 00 00
Offset  Size  Field             Value
58      4     Signature         PK\5\6
5C      2     ThisDiskNumber    0
5E      2     StartDiskNumber   0
60      2     ThisDiskEntries   0
62      2     StartDiskEntries  1
64      4     Size              0x33
68      4     CDOffset          0x25
6C      2     CommentLen        0
6E      ?     Comment           n/a

Slide 21

Slide 21 text

Compression method and compressed data
[Same archive as on the previous slide (a complete ZIP archive with empty deflated data), pointing out:]
- CompMethod: 8=Deflate, in both the Local File Header and the Central Directory
- Content: 03 00 (CompressSize 2, UncompSize 0), stored after the FileName in the Local File Header
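Reading the compression method and the compressed data out of the Local File Header takes one `struct` call. A minimal sketch over the 37 header bytes of this archive only (not the Central Directory or End of Central Directory); `LFH` and the variable names are illustrative.

```python
import struct
import zlib

# Local File Header of the archive above: 30 fixed bytes, the name, then the content.
local_file = (
    bytes.fromhex("50 4B 03 04 0A 00 00 00 08 00 00 00 00 00 00 00 "
                  "00 00 02 00 00 00 00 00 00 00 05 00 00 00")
    + b"empty"
    + bytes.fromhex("03 00")
)

# Signature, NeededVersion, Flags, CompMethod, ModTime, ModDate,
# CRC32, CompressSize, UncompSize, FileNameLen, ExtraFieldLen.
LFH = struct.Struct("<4sHHHHHIIIHH")
(sig, needed, flags, method, mtime, mdate,
 crc, csize, usize, fnlen, extralen) = LFH.unpack_from(local_file)

name = local_file[LFH.size:LFH.size + fnlen]
data_start = LFH.size + fnlen + extralen
content = local_file[data_start:data_start + csize]

# Method 8: the content is a raw Deflate stream; method 0 would be the data itself.
plain = zlib.decompress(content, wbits=-15) if method == 8 else content

print(sig, method, name, content.hex(), plain)
```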

Slide 22

Slide 22 text

Disambiguation
Deflate is a compression algorithm.
Zip usually uses Deflate, but not necessarily.
Zlib and Gzip both wrap only Deflate, each in a different way.
Same exchangeable data, but no direct compatibility.

Slide 23

Slide 23 text

Conclusion

Slide 24

Slide 24 text

3 different wrappers around Deflate
- Zlib
- GZIP member
- ZIP Local File Header (which can also wrap Store, Deflate64, Bzip2…)

Slide 25

Slide 25 text

Conclusion
- Deflate is a very standard compression algorithm.
- Zip can use Deflate, but other algorithms too (Storing…).
- Zip can use a different compression per file.
- Zlib is a wrapper around a Deflate stream.
- A Gzip member is a wrapper around a Deflate stream.
- A Gzip file is one or more members.
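The "one or more members" point is easy to try with Python's `gzip` module: two members compressed separately and simply concatenated still read back as one file. A tiny sketch with made-up payloads.

```python
import gzip

# Two independent Gzip members, each with its own header, Deflate data and footer.
first = gzip.compress(b"Hello ")
second = gzip.compress(b"World!")

# Concatenating members gives a valid Gzip file whose content is the concatenation.
two_members = first + second
combined = gzip.decompress(two_members)
print(combined)
```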

Slide 26

Slide 26 text

Moving data around
Deflate data can be moved from/to:
- Zlib: 2 bytes before // 4 bytes after.
- Gzip: variable header // 8 bytes after.
- Zip using Deflate
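As an illustration of the Zip side, the sketch below lifts the Deflate bytes out of a ZIP entry and re-wraps them as a Zlib stream, without recompressing. The helpers `deflate_from_zip` and `wrap_as_zlib` are hypothetical names; the code assumes an unencrypted Deflate entry and uses the common `78 9C` Zlib header (32 KB window, no dictionary).

```python
import io
import struct
import zipfile
import zlib


def deflate_from_zip(zip_bytes: bytes, name: str) -> bytes:
    """Return the raw Deflate data of one entry, located through its Local File Header."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_DEFLATED:
        raise ValueError("entry is not Deflate-compressed")
    # FileNameLen and ExtraFieldLen sit at offset 26 of the 30-byte fixed header.
    fnlen, extralen = struct.unpack_from("<HH", zip_bytes, info.header_offset + 26)
    start = info.header_offset + 30 + fnlen + extralen
    return zip_bytes[start:start + info.compress_size]


def wrap_as_zlib(raw: bytes, plain: bytes) -> bytes:
    """2 bytes before, 4 bytes after: header, Deflate data, big-endian Adler32."""
    return b"\x78\x9c" + raw + struct.pack(">I", zlib.adler32(plain))


payload = b"Hello World!\n" * 10
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", payload)

raw = deflate_from_zip(buf.getvalue(), "hello.txt")
plain = zlib.decompress(raw, wbits=-15)
zstream = wrap_as_zlib(raw, plain)
print(len(raw), zlib.decompress(zstream) == payload)
```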

Slide 27

Slide 27 text

Thank you! Questions, suggestions…

Slide 28

Slide 28 text

Extra pictures

Slide 29

Slide 29 text

A ZIP archive with some stored content
(Structures are numbered in parsing order and listed in file order. Offsets are in hex.)

3. Local File Header
Bytes: P K 03 04 0A 00 00 00 00 00 00 00 00 00 DD DD 14 7D 0D 00 00 00 0D 00 00 00 09 00 00 00 "hello.txt" "Hello World!\n"
Offset  Size  Field            Value
00      4     Signature        PK\3\4
04      2     NeededVersion    10
06      2     Flags            None
08      2     CompMethod       0=Store
0A      2     ModTime          00:00
0C      2     ModDate          0/0/1980
0E      4     CRC32            0x7D14DDDD
12      4     CompressSize     13
16      4     UncompSize       13
1A      2     FileNameLen      9
1C      2     ExtraFieldLen    0
1E      ?     FileName         hello.txt
27      ?     Content          Hello World!\n
34      ?     ExtraField       n/a

2. Central Directory
Bytes: P K 01 02 00 00 0A 00 00 00 00 00 00 00 00 00 DD DD 14 7D 0D 00 00 00 0D 00 00 00 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "hello.txt"
Offset  Size  Field            Value
34      4     Signature        PK\1\2
38      2     MadeVersion      0
3A      2     NeededVersion    10
3C      2     Flags            None
3E      2     CompMethod       0=Store
40      2     ModTime          00:00
42      2     ModDate          0/0/1980
44      4     CRC32            0x7D14DDDD
48      4     CompressSize     13
4C      4     UncompSize       13
50      2     FileNameLen      9
52      2     ExtraFieldLen    0
54      2     FileCommentLen   0
56      2     DiskNumberStart  0
58      2     InternalAttr     0
5A      4     ExternalAttr     0
5E      4     LFHOffset        0
62      ?     FileName         hello.txt
6B      ?     ExtraField       n/a
6B      ?     FileComment      n/a

1. End of Central Directory
Bytes: P K 05 06 00 00 00 00 00 00 01 00 37 00 00 00 34 00 00 00 00 00
Offset  Size  Field             Value
6B      4     Signature         PK\5\6
6F      2     ThisDiskNumber    0
71      2     StartDiskNumber   0
73      2     ThisDiskEntries   0
75      2     StartDiskEntries  1
77      4     Size              0x37
7B      4     CDOffset          0x34
7F      2     CommentLen        0
81      ?     Comment           n/a

Slide 30

Slide 30 text

A ZIP archive with empty stored content
(Structures are numbered in parsing order and listed in file order. Offsets are in hex.)

3. Local File Header
Bytes: P K 03 04 0A 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 e m p t y
Offset  Size  Field            Value
00      4     Signature        PK\3\4
04      2     NeededVersion    10
06      2     Flags            None
08      2     CompMethod       0=Store
0A      2     ModTime          00:00
0C      2     ModDate          0/0/1980
0E      4     CRC32            0x00000000
12      4     CompressSize     0
16      4     UncompSize       0
1A      2     FileNameLen      5
1C      2     ExtraFieldLen    0
1E      ?     FileName         empty
23      ?     Contents         n/a
23      ?     ExtraField       n/a

2. Central Directory
Bytes: P K 01 02 00 00 0A 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e m p t y
Offset  Size  Field            Value
23      4     Signature        PK\1\2
27      2     MadeVersion      0
29      2     NeededVersion    10
2B      2     Flags            None
2D      2     CompMethod       0=Store
2F      2     ModTime          00:00
31      2     ModDate          0/0/1980
33      4     CRC32            0x00000000
37      4     CompressSize     0
3B      4     UncompSize       0
3F      2     FileNameLen      5
41      2     ExtraFieldLen    0
43      2     FileCommentLen   0
45      2     DiskNumberStart  0
47      2     InternalAttr     0
49      4     ExternalAttr     0
4D      4     LFHOffset        0
51      ?     FileName         empty
56      ?     ExtraField       n/a
56      ?     FileComment      n/a

1. End of Central Directory
Bytes: P K 05 06 00 00 00 00 00 00 01 00 33 00 00 00 23 00 00 00 00 00
Offset  Size  Field             Value
56      4     Signature         PK\5\6
5A      2     ThisDiskNumber    0
5C      2     StartDiskNumber   0
5E      2     ThisDiskEntries   0
60      2     StartDiskEntries  1
62      4     Size              0x33
66      4     CDOffset          0x23
6A      2     CommentLen        0
6C      ?     Comment           n/a

Slide 31

Slide 31 text

A ZIP archive with empty deflated content
(Structures are numbered in parsing order and listed in file order. Offsets are in hex.)

3. Local File Header
Bytes: P K 03 04 0A 00 00 00 08 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 05 00 00 00 e m p t y 03 00
Offset  Size  Field            Value
00      4     Signature        PK\3\4
04      2     NeededVersion    10
06      2     Flags            None
08      2     CompMethod       8=Deflate
0A      2     ModTime          00:00
0C      2     ModDate          0/0/1980
0E      4     CRC32            0x00000000
12      4     CompressSize     2
16      4     UncompSize       0
1A      2     FileNameLen      5
1C      2     ExtraFieldLen    0
1E      ?     FileName         empty
23      ?     Content          03 00
25      ?     ExtraField       n/a

2. Central Directory
Bytes: P K 01 02 00 00 0A 00 00 00 08 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e m p t y
Offset  Size  Field            Value
25      4     Signature        PK\1\2
29      2     MadeVersion      0
2B      2     NeededVersion    10
2D      2     Flags            None
2F      2     CompMethod       8=Deflate
31      2     ModTime          00:00
33      2     ModDate          0/0/1980
35      4     CRC32            0x00000000
39      4     CompressSize     2
3D      4     UncompSize       0
41      2     FileNameLen      5
43      2     ExtraFieldLen    0
45      2     FileCommentLen   0
47      2     DiskNumberStart  0
49      2     InternalAttr     0
4B      4     ExternalAttr     0
4F      4     LFHOffset        0
53      ?     FileName         empty
58      ?     ExtraField       n/a
58      ?     FileComment      n/a

1. End of Central Directory
Bytes: P K 05 06 00 00 00 00 00 00 01 00 33 00 00 00 25 00 00 00 00 00
Offset  Size  Field             Value
58      4     Signature         PK\5\6
5C      2     ThisDiskNumber    0
5E      2     StartDiskNumber   0
60      2     ThisDiskEntries   0
62      2     StartDiskEntries  1
64      4     Size              0x33
68      4     CDOffset          0x25
6C      2     CommentLen        0
6E      ?     Comment           n/a

Slide 32

Slide 32 text

A Gzip file (with a filename before the Deflate data)

Slide 33

Slide 33 text

A full-featured GZIP
- Magic: 1F 8B
- Method: 08 (8 = Deflate)
- Flags: 1C (Extra Field = 4, Filename = 8, Comment = 0x10)
- ModTime: 26 F7 4F 62 (2022/4/8 10:49)
- Extra Flags: 00 (None)
- OS: FF (Unknown)
- Extra Field:
  - Size16: 14 00 (20)
  - SubField: Type "GZ", Size16 10 00 (16), Data “extra field data”
- Filename: “filename\0”
- Comment: “comment\0”
- Deflate data: 01 0C 00 F3 FF + data
  - Last/Type: True/Raw
  - Length: 12
  - !Length: 0xFFF3
  - Data: Hello World!
- CRC32: A3 1C 29 1C (0x1c291ca3)
- lenUncomp: 0C 00 00 00 (12)
Extra Field, Filename, Comment: set in Flags, stored between OS and Deflate data.
Filename & Comment: null-terminated.
Extra field: Size16 first, then SubFields.
TEXT and CRC16 are not usually supported.
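The member above can be rebuilt byte by byte and handed to Python's `gzip` module, which skips the optional fields and inflates the stored block. A sketch under the assumption that the field values are exactly those listed on the slide; variable names are illustrative.

```python
import datetime
import gzip
import struct
import zlib

payload = b"Hello World!"

# Extra field: one subfield with type "GZ", a 16-bit size, then its data.
extra = b"GZ" + struct.pack("<H", 16) + b"extra field data"

header = (
    bytes([0x1F, 0x8B, 0x08, 0x1C])      # magic, Deflate, flags: extra + filename + comment
    + bytes.fromhex("26 F7 4F 62")       # ModTime (little-endian Unix time)
    + bytes([0x00, 0xFF])                # extra flags: none, OS: unknown
    + struct.pack("<H", len(extra)) + extra
    + b"filename\0"
    + b"comment\0"
)

# One final stored Deflate block: header byte, Length, !Length, data.
deflate_data = struct.pack("<BHH", 0x01, len(payload), len(payload) ^ 0xFFFF) + payload
trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
member = header + deflate_data + trailer

(mtime,) = struct.unpack_from("<I", member, 4)
when = datetime.datetime.fromtimestamp(mtime, datetime.timezone.utc)
print(len(member), when.isoformat(), gzip.decompress(member))
```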

Slide 34

Slide 34 text

A PNG image (PNG is an image format using Zlib)
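To see the Zlib stream inside a PNG, one can build a tiny image by hand and read its IDAT chunk back. A minimal sketch: the 1x1 grayscale image and the helpers `chunk` and `iter_chunks` are made up for illustration, and only the chunk layout (length, type, data, CRC32) is relied upon.

```python
import struct
import zlib


def chunk(ctype: bytes, data: bytes) -> bytes:
    """PNG chunk: big-endian length, type, data, CRC32 over type + data."""
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", zlib.crc32(ctype + data))


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk after the 8-byte PNG signature."""
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length


scanline = b"\x00\x00"  # filter type 0, then one black 8-bit grayscale pixel
png = (
    b"\x89PNG\r\n\x1a\n"
    + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))  # 1x1, 8-bit grayscale
    + chunk(b"IDAT", zlib.compress(scanline))                       # pixel data as a Zlib stream
    + chunk(b"IEND", b"")
)

idat = b"".join(data for ctype, data in iter_chunks(png) if ctype == b"IDAT")
print(idat[:2].hex(), zlib.decompress(idat))
```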

Slide 35

Slide 35 text

One more thing…

Slide 36

Slide 36 text

How can you prove that it’s the same data?
Make files that are both formats simultaneously, with the Deflate data in common 😱😉
ZGip: Zip/Gzip polyglots, with shared Deflate data.

Slide 37

Slide 37 text

The End