Zip: Data compression (Learn compression algorithms in 20 minutes)

Run-Length Encoding, Huffman Code, LZ77, LZ78, LZW, Deflate
Video: https://www.youtube.com/watch?v=Yc_orrKXn1I

Leonardo YongUk Kim

November 25, 2020

Transcript

  1. Zip: from RLE to Deflate
    • RLE (Run-Length Encoding)
    • Huffman Code
    • LZ77 (LZ1)
    • LZ78 (LZ2)
    • LZW
    • LZSS
    • Deflate
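  2. RLE (Run-Length Encoding): encoding for old graphics formats
    • Used by old encodings such as BMP and RLE.
    • Collapses a run of consecutive identical characters into a count.
    • Efficiency drops when the input has no consecutive repeated characters.
    Example from the slide: five A, two D, two C, and one E are encoded as the counts 5, 2, 2, and 1 alongside their symbols.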
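To make the run-collapsing idea concrete, here is a minimal sketch in Python. The function names `rle_encode` and `rle_decode` and the sample input `"AAAAADDCCE"` are assumptions chosen to match the counts on the slide; real formats such as BMP RLE pack the pairs into bytes differently.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into (character, run length)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (character, run length) pairs back into the original text."""
    return "".join(ch * n for ch, n in pairs)


encoded = rle_encode("AAAAADDCCE")
print(encoded)                 # [('A', 5), ('D', 2), ('C', 2), ('E', 1)]
print(rle_decode(encoded))     # AAAAADDCCE
print(rle_encode("ABCD"))      # every run has length 1, so nothing is saved
```

The last call illustrates the weakness noted on the slide: without runs, each character still costs a count, so the output can be larger than the input.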
  3. Huffman Code: entropy encoding
    • Created by David A. Huffman.
    • Reduces the number of bits spent on a symbol according to the probability that the symbol appears. (Entropy encoding)
  4. Huffman Code: entropy encoding
    Input: A A A D D C C E A A D
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
  5. Huffman Code: entropy encoding
    List: A 0.45, D 0.27, C 0.18, E 0.09
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    Sort the symbols by probability.
  6. Huffman Code: entropy encoding
    List: A 0.45, D 0.27, C 0.18, E 0.09
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    Select the two lowest probabilities.
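  7. Huffman Code: entropy encoding
    List: A 0.45, D 0.27, 0.27 (E + C)
    Tree: 0.27 → E 0.09, C 0.18
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    Select the two lowest probabilities, build a tree from them, and add the summed probability to the list.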
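  8. Huffman Code: entropy encoding
    List: A 0.45, D 0.27, 0.27 (E + C)
    Tree: 0.27 → E 0.09, C 0.18
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    Select the two lowest probabilities. (If the values are equal, the node with the lower height goes on the left.)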
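  9. Huffman Code: entropy encoding
    List: A 0.45, 0.54 (D + E + C)
    Tree: 0.54 → D 0.27, (0.27 → E 0.09, C 0.18)
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    Select the two lowest probabilities, extend the tree, and add the summed probability back to the list.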
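  10. Huffman Code: entropy encoding
    Tree: 0.99 → A 0.45, (0.54 → D 0.27, (0.27 → E 0.09, C 0.18))
    Symbol probabilities (count): A 0.45 (5), C 0.18 (2), D 0.27 (3), E 0.09 (1)
    The root shows 0.99 rather than 1 because the probabilities are rounded.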
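  11. Huffman Code: entropy encoding
    Tree: 0.99 → A 0.45 (0), (0.54 (1) → D 0.27 (0), (0.27 (1) → E 0.09 (0), C 0.18 (1)))
    Symbol codes (probability): A 0 (0.45), C 111 (0.18), D 10 (0.27), E 110 (0.09)
    Label each left node 0 and each right node 1. The higher the frequency, the fewer the bits.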
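The tree-building steps on slides 4 to 11 can be sketched with a priority queue. This is an illustrative sketch, not the deck's own code: `huffman_codes` is a hypothetical name, and ties are broken by insertion order rather than by the tree-height rule from slide 8, so on other inputs the exact bit patterns may differ from a hand-built tree even though the code lengths stay optimal.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text: str) -> dict[str, str]:
    """Build a Huffman code table by repeatedly merging the two rarest nodes."""
    if not text:
        return {}
    tie = count()  # unique tie-breaker so the heap never compares tree nodes
    heap = [(freq, next(tie), sym) for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # lowest frequency
        f2, _, right = heapq.heappop(heap)   # second lowest
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))

    codes: dict[str, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")      # left branch gets 0
            walk(node[1], prefix + "1")      # right branch gets 1
        else:                                # leaf: a single symbol
            codes[node] = prefix or "0"      # a lone symbol still needs one bit

    walk(heap[0][2], "")
    return codes


codes = huffman_codes("AAADDCCEAAD")
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

For the slide's input the most frequent symbol, A, gets a 1-bit code, D gets 2 bits, and C and E get 3 bits each, which is the same length profile as the table on slide 11.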
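  12. LZ77 (LZ1): Lempel & Ziv 77
    • Created by Abraham Lempel and Yaakov Ziv.
    • Called LZ77 because it was created in 1977, and LZ1 because it was the first.
    • Removes duplicated strings.
    • Characterized by its sliding window.
    • A repeated string cannot be compressed if it is longer than the sliding window.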
  13. Sliding Window
    Window (also called Search Buffer or Sliding Window): H e l l A B
    View: H e l l o
    • The Window is what gets searched; the View is the content to process next.
    • Data that has been processed in the View moves into the Window.
  14. Sliding Window
    Window (also called Search Buffer or Sliding Window): H e l l A B
    View: H e l l o
    • Find, in the Window, the "Hell" that matches the "Hell" at the front of the View.
  15. LLD: Literal, Length, Distance
    Text: H e l l A B H e l l o
    Distance = 6 (distance back from the View)
    Length = 4 (length of the matching string)
    Literal = ‘o’ (first character that does not match)
    Output: (6, 4, ‘o’)
    • LZ77 is the job of producing a list of LLD tuples.
  16. LLD: Literal, Length, Distance
    Text shown: H e l l A
    Distance = 0, Length = 0, Literal = ‘H’
    Output: (0, 0, ‘H’)
  17. LLD: Literal, Length, Distance
    Text shown: H e l l A B
    Distance = 0, Length = 0, Literal = ‘e’
    Output: (0, 0, ‘H’) (0, 0, ‘e’)
  18. LLD: Literal, Length, Distance
    Text shown: H e l l A B H
    Distance = 0, Length = 0, Literal = ‘l’
    Output: (0, 0, ‘H’) (0, 0, ‘e’) (0, 0, ‘l’)
  19. LLD: Literal, Length, Distance
    Text shown: H e l l A B H e
    Distance = 1, Length = 1, Literal = ‘A’
    Output: (0, 0, ‘H’) (0, 0, ‘e’) (0, 0, ‘l’) (1, 1, ‘A’)
  20. LLD: Literal, Length, Distance
    Text shown: H e l l A B H e l l
    Distance = 0, Length = 0, Literal = ‘B’
    Output: (0, 0, ‘H’) (0, 0, ‘e’) (0, 0, ‘l’) (1, 1, ‘A’) (0, 0, ‘B’)
  21. LLD: Literal, Length, Distance
    Text shown: H e l l A B H e l l o
    Distance = 6, Length = 4, Literal = ‘o’
    Output: (0, 0, ‘H’) (0, 0, ‘e’) (0, 0, ‘l’) (1, 1, ‘A’) (0, 0, ‘B’) (6, 4, ‘o’)
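The walkthrough on slides 16 to 21 can be reproduced with a small greedy encoder. This is a sketch under assumptions: the names `lz77_encode` and `lz77_decode` and the `window` size of 32 are illustrative, the tuple order follows the slides as (distance, length, literal), and the brute-force search is far slower than what real compressors use.

```python
def lz77_encode(data: str, window: int = 32) -> list[tuple[int, int, str]]:
    """Emit (distance, length, literal) tuples using a greedy longest match."""
    out = []
    i = 0
    while i < len(data):
        best_dist, best_len = 0, 0
        for start in range(max(0, i - window), i):   # every position in the window
            length = 0
            # Stop one short of the end so a literal always follows the match.
            while (i + length < len(data) - 1
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_dist, best_len = i - start, length
        out.append((best_dist, best_len, data[i + best_len]))
        i += best_len + 1                            # skip the match and the literal
    return out


def lz77_decode(tuples: list[tuple[int, int, str]]) -> str:
    """Rebuild the text by copying from earlier output, then adding the literal."""
    out: list[str] = []
    for dist, length, literal in tuples:
        for _ in range(length):
            out.append(out[-dist])                   # copy one character from dist back
        out.append(literal)
    return "".join(out)


tuples = lz77_encode("HellABHello")
print(tuples)
print(lz77_decode(tuples))
```

On `"HellABHello"` the encoder yields the same six tuples as slide 21, and decoding them restores the original text.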
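  22. LZ78 (LZ2): Lempel & Ziv 78
    • Created by Abraham Lempel and Yaakov Ziv.
    • Called LZ78 because it was created in 1978, and LZ2 because it was the second.
    • Keeps a dictionary in memory. Runtime memory use is the problem.
    • Updates the dictionary incrementally by appending a new symbol to the longest known pattern.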
  23. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    • 0 means "not in the dictionary".
    • The string ‘H’ is added to the dictionary.
    Output: (0, ‘H’)
    Dictionary (index: string): 1: H
  24. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    • 0 means "not in the dictionary".
    • The string ‘e’ is added to the dictionary.
    Output: (0, ‘H’) (0, ‘e’)
    Dictionary (index: string): 1: H, 2: e
  25. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    • 0 means "not in the dictionary".
    • ‘l’ is added to the dictionary.
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’)
    Dictionary (index: string): 1: H, 2: e, 3: l
  26. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    • 3 is ‘l’ in the dictionary, and the next character is ‘A’.
    • “lA” is added to the dictionary.
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’) (3, ‘A’)
    Dictionary (index: string): 1: H, 2: e, 3: l, 4: lA
  27. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    • 0 means "not in the dictionary".
    • ‘B’ is added to the dictionary.
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’) (3, ‘A’) (0, ‘B’)
    Dictionary (index: string): 1: H, 2: e, 3: l, 4: lA, 5: B
  28. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’) (3, ‘A’) (0, ‘B’) (1, ‘e’)
    Dictionary (index: string): 1: H, 2: e, 3: l, 4: lA, 5: B, 6: He
  29. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’) (3, ‘A’) (0, ‘B’) (1, ‘e’) (3, ‘l’)
    Dictionary (index: string): 1: H, 2: e, 3: l, 4: lA, 5: B, 6: He, 7: ll
  30. LZ78 (LZ2): Lempel & Ziv 78
    Input: H e l l A B H e l l o
    Output: (0, ‘H’) (0, ‘e’) (0, ‘l’) (3, ‘A’) (0, ‘B’) (1, ‘e’) (3, ‘l’) (0, ‘o’)
    Dictionary (index: string): 1: H, 2: e, 3: l, 4: lA, 5: B, 6: He, 7: ll
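The dictionary growth on slides 23 to 30 can be sketched as follows. The name `lz78_encode` is hypothetical, and the dictionary is an unbounded Python dict, which ignores the runtime memory concern raised on slide 22. Unlike the table on slide 30, this sketch also stores the final ‘o’ as entry 8, which does not change the output.

```python
def lz78_encode(data: str) -> list[tuple[int, str]]:
    """Emit (dictionary index, next character) pairs; index 0 means 'no prefix'."""
    dictionary: dict[str, int] = {}      # phrase -> index, indices start at 1
    out = []
    phrase = ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                 # keep extending the longest known phrase
        else:
            out.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit its prefix index and last character.
        prefix = phrase[:-1]
        out.append((dictionary[prefix] if prefix else 0, phrase[-1]))
    return out


print(lz78_encode("HellABHello"))
```

For `"HellABHello"` this produces the eight pairs listed on slide 30.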
  31. LZW

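  32. LZW: an improvement on LZ78 (patent troll)
    • Lempel, Ziv, Terry Welch (1984)
    • Improves LZ78 with both compression efficiency and speed in mind.
    • The output size is not optimal.
    • Used in formats such as GIF. (CompuServe)
    • Netscape implemented Animated GIF, which gave the format eternal life.
    • Unisys and others asserted their patents, so it was replaced in the market. (GIF -> PNG)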
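  33. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • At the start, P is the first character; C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: H
    Dictionary (index: string): 256: He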
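  34. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: He
    Dictionary (index: string): 256: He, 257: el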
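  35. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: Hel
    Dictionary (index: string): 256: He, 257: el, 258: ll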
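  36. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: Hell
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA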
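  37. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: HellA
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB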
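  38. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: HellAB
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB, 261: BH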
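  39. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is in the dictionary,
    • set P to P + C.
    Output: HellAB
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB, 261: BH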
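  40. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: HellAB(256)
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB, 261: BH, 262: Hel
    As a result, each symbol ends up taking 9 bits.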
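  41. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is in the dictionary,
    • set P to P + C.
    Output: HellAB(256)
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB, 261: BH, 262: Hel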
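  42. LZW: an improvement on LZ78 (patent troll)
    Input: H e l l A B H e l l o (markers: P, C)
    • C is always the newly read character.
    • If P + C is not in the dictionary, add P + C to the dictionary,
    • output P,
    • and set P to C.
    Output: HellAB(256)(258)o
    Dictionary (index: string): 256: He, 257: el, 258: ll, 259: lA, 260: AB, 261: BH, 262: Hel, 263: llo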
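The P and C rules from slides 33 to 42 fit in a few lines. This sketch assumes the input only contains characters with code points below 256, so the initial dictionary covers every single character; `lzw_encode` is an illustrative name, and the output is a list of integer codes rather than a packed 9-bit stream.

```python
def lzw_encode(data: str) -> list[int]:
    """Encode with LZW: codes 0-255 are single characters, 256+ are learned strings."""
    dictionary = {chr(i): i for i in range(256)}
    out = []
    p = ""                                   # P: longest known prefix so far
    for c in data:                           # C: the newly read character
        if p + c in dictionary:
            p += c                           # P + C is known: extend P
        else:
            out.append(dictionary[p])        # output P
            dictionary[p + c] = len(dictionary)  # learn P + C
            p = c                            # set P to C
    if p:
        out.append(dictionary[p])            # flush whatever is left in P
    return out


codes = lzw_encode("HellABHello")
print(codes)   # [72, 101, 108, 108, 65, 66, 256, 258, 111]
```

The codes 256 and 258 are the learned entries "He" and "ll", matching the output `HellAB(256)(258)o` on slide 42. Because codes can exceed 255, each one needs at least 9 bits, which is the cost noted on slide 40.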
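  43. LZSS: an improved LZ77
    • James A. Storer, Thomas Szymanski (The photo shows only James. I could not find a photo of Thomas. Sorry, Thomas.)
    • Created in 1982.
    • When the replacement would be longer than the text it replaces, the original is written instead.
    • Chosen by most compression tools (ZIP, ARJ, RAR, ZOO, LHArc, and others).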
  44. LZSS: Literal, Length, Distance
    Input: H e l l A B H e l l o
    Length = 4, Literal = ‘o’
    Output: HellAB (6, 4) o
    When the replacement string would be longer, the original string is used as is.
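A minimal sketch of the LZSS rule: emit a (distance, length) pair only when the match is long enough to pay for itself, otherwise emit the literal character. The `min_match` threshold of 3 and the `window` of 32 are assumptions for illustration (3 happens to be the shortest length Deflate encodes), and `lzss_encode` and `lzss_decode` are hypothetical names.

```python
def lzss_encode(data: str, window: int = 32, min_match: int = 3) -> list:
    """Emit literals (str) or (distance, length) pairs for matches >= min_match."""
    out = []
    i = 0
    while i < len(data):
        best_dist, best_len = 0, 0
        for start in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[start + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_dist, best_len = i - start, length
        if best_len >= min_match:
            out.append((best_dist, best_len))   # the reference is worth it
            i += best_len
        else:
            out.append(data[i])                 # too short: keep the original character
            i += 1
    return out


def lzss_decode(tokens: list) -> str:
    """Rebuild the text from literals and (distance, length) references."""
    out: list[str] = []
    for token in tokens:
        if isinstance(token, tuple):
            dist, length = token
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(token)
    return "".join(out)


tokens = lzss_encode("HellABHello")
print(tokens)   # ['H', 'e', 'l', 'l', 'A', 'B', (6, 4), 'o']
```

Compared with the LZ77 tuples, the single repeated ‘l’ is no longer encoded as (1, 1, ‘A’); it stays a plain literal, which is the saving LZSS is after.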
  45. Deflate: LZSS + Huffman Code
    • Developed by Phillip Walter Katz for PKZIP. (1993)
    • Standardized as RFC 1951. (1996)
    • gzip, zip, png, zlib, pkzip, 7-zip, jar: the de facto world standard.
    • He changed the world, but PK died young, in his 30s, from alcoholism.
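  46. Deflate: LZSS + Huffman Code
    • Applies Huffman coding to the output of LZSS compression.
    • The strategy: remove duplication first, then reduce the bits spent on each symbol.
    • Uses two Huffman codes.
    • L tree: Huffman coding applied to Literal (characters, 0-255) and Length (lengths 3-258).
    • D tree: a table for handling Distance (1 to 32,768 in RFC 1951).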
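To see Deflate in action without writing the two trees by hand, Python's standard `zlib` module can produce a raw RFC 1951 stream. This is a usage sketch rather than an implementation of the algorithm: the sample data is made up, and the negative `wbits` value asks zlib to omit the zlib header and checksum so that only the Deflate stream remains.

```python
import zlib

# Repetitive sample data (an assumption for illustration) so LZ matching has work to do.
data = b"HellABHello" * 20

# wbits=-15 selects a raw Deflate stream with a 32 KiB window.
compressor = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15)
raw = compressor.compress(data) + compressor.flush()

restored = zlib.decompress(raw, -15)    # the same negative wbits is needed to decode

print(len(data), "bytes in,", len(raw), "bytes out")
print(restored == data)                 # True
```

The 32 KiB window is where the upper bound on match distance comes from, and the repeated phrase collapses into back references that are then Huffman coded.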
  47. Deflate: LZSS + Huffman Code
    • Modern techniques such as WebP
    • are basically built on Deflate's strategy,
    • plus additional improvement strategies and algorithms.