Finding Synchronization Codes to Boost Compression by Substring Enumeration

Dany Vohl
October 08, 2012

Synchronization codes are frequently used in numerical data transmission and storage. Compression by Substring Enumeration (CSE) is a new lossless compression scheme that has turned out to be a new and unusual application for synchronization codes. CSE is an inherently bit-oriented technique. However, since the usual benchmark files are all byte-oriented, CSE incurs a penalty due to a problem called phase unawareness. Subsequent work showed that inserting a synchronization code inside the data before compressing it improves the compression performance. In this paper, we present two constraint models that compute the shortest synchronization codes, i.e. those that add the fewest synchronization bits to the original data. We find synchronization codes for blocks of up to 64 bits.

Transcript

  1. Finding Synchronization Codes to Boost Compression by Substring Enumeration

    Dany Vohl, Claude-Guy Quimper, Danny Dubé
    Dany Vohl, Synchronization Codes to Boost Compression by Substring Enumeration 1/73
  2. Introduction

    • Synchronization codes are frequently used in numerical data transmission & storage
      • i.e. when data reception is ill-synchronized
    • Recent work on data compression gives synchronization codes a new and unusual purpose
    • This work aims to find synchronization codes

    1. Overview: 1st application 2. A 2nd application: CSE 3. Characteristics 4. Constraint model 5. Pseudo-Boolean model 6. Experimental results
  5. Structure of presentation

    1. An application of synchronization codes
    2. The new application: CSE
    3. Characteristics of such codes
    4. Constraint model
    5. Pseudo-Boolean model
    6. Experimental results
  6. Part 1: An application of synchronization codes

    Hard drive
  7. Overview: hard drive

    Spinning disk, read/write head (RW)
  8. Overview: hard drive

    Where does a track (or sector, or byte) start?
  9. Part 2: Compression by Substring Enumeration: A Brief Description
  10. Compression by Substring Enumeration: A Brief Description

    • CSE compresses by transmitting the number of occurrences of every possible substring of bits
  11. Compression by Substring Enumeration: A Brief Description

    0 1 0 0 0 0 0 1
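
To make the idea of "number of occurrences of every possible substring" concrete, here is a minimal sketch that counts substrings of the example string 0 1 0 0 0 0 0 1. It is only an illustration of the counting step, not the CSE coder itself. The function name `substring_counts`, the `max_len` cut-off and the `circular` switch are assumptions made for this example; the slides do not say whether the string is treated as circular, so both variants are offered.

```python
from collections import Counter


def substring_counts(bits: str, max_len: int, circular: bool = False) -> Counter:
    """Count occurrences of every substring of length 1..max_len in `bits`.

    Assumes max_len <= len(bits). With circular=True, substrings may wrap
    around the end of the string (an assumption, not stated on the slides).
    """
    n = len(bits)
    # For the circular variant, append a wrapped prefix so that substrings
    # starting near the end can continue at the beginning.
    extended = bits + bits[: max_len - 1] if circular else bits
    counts = Counter()
    for length in range(1, max_len + 1):
        # Circular: one substring starts at each of the n positions.
        # Linear: only start positions that leave room for `length` symbols.
        starts = n if circular else n - length + 1
        for start in range(starts):
            counts[extended[start:start + length]] += 1
    return counts


linear = substring_counts("01000001", max_len=2)
print(linear["0"], linear["1"], linear["00"], linear["01"], linear["10"], linear["11"])
# prints: 6 2 4 2 1 0
```

With `circular=True` the only count that changes here is "10", which gains the occurrence that wraps from the last bit to the first.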
  16. Compression by Substring Enumeration: A Brief Description

    • Problem:
      • CSE is bit-oriented while benchmarks are byte-oriented
      • CSE is unaware of the phase of the bits within the byte
  17. Compression by Substring Enumeration: A Brief Description

    • Solution suggested (Dubé, ISITA, 2010):
      • Add control bits before compression to achieve strong synchronization
      • Substrings inside byte boundaries
    • Side effects:
      • The pre-compressed file is larger than the original file
      • But it is highly compressible
  19. Part 3: Characteristics of our synchronization codes

    Phase, Synchronization and Reliability
  21. Phase, Synchronization and Reliability

    Given this binary word:  0 1 0 1 0 1 0 1
    With bit's phase:        0 1 2 3 4 5 6 7
  22. Phase, Synchronization and Reliability

    Word:   … 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 …
    Phase:  … 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 …
    Phase unknown to the reader: ? ? ? ? ? ? ? ?

    If one lands on a random bit inside the word, would it be possible to determine the phase of this bit? Let's say one reads a “1”.
  23. Phase, Synchronization and Reliability

    This is where synchronization codes come in handy

    Original block:        0 1 0 1 0 1 0 1
    Synchronization code:  _ _ _ _ _ _ 0 _ _ 0 1 1 1
    Synchronized block:    0 1 0 1 0 1 0 0 1 0 1 1 1
  24. Phase, Synchronization and Reliability

    Sequence of several synchronized blocks

    Original block:        0 1 0 1 0 1 0 1
    Synchronization code:  _ _ _ _ _ _ 0 _ _ 0 1 1 1
    Synchronized block:    0 1 0 1 0 1 0 0 1 0 1 1 1
    Phase:                 0 1 2 3 4 5 6 7 8 9 10 11 12

    Original block:        1 1 1 0 0 0 1 1
    Synchronization code:  _ _ _ _ _ _ 0 _ _ 0 1 1 1
    Synchronized block:    1 1 1 0 0 0 0 1 1 0 1 1 1
    Phase:                 0 1 2 3 4 5 6 7 8 9 10 11 12
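
The construction of a synchronized block from an original block can be sketched as follows: each `_` in the synchronization code is a slot for the next data bit, and every other symbol is a control bit copied as is. The function names `synchronize` and `desynchronize` are made up for this illustration; the code template and the two data blocks are the ones shown on the slides.

```python
def synchronize(data_bits: str, code: str) -> str:
    """Fill the '_' slots of `code` with successive data bits, block by block."""
    d = code.count("_")  # number of data bits per block
    if d == 0 or len(data_bits) % d != 0:
        raise ValueError("data length must be a multiple of the number of '_' slots")
    data = iter(data_bits)
    out = []
    for _ in range(len(data_bits) // d):
        for symbol in code:
            # '_' takes the next data bit; control bits are copied unchanged.
            out.append(next(data) if symbol == "_" else symbol)
    return "".join(out)


def desynchronize(sync_bits: str, code: str) -> str:
    """Drop the control bits again, keeping only the bits in '_' positions."""
    return "".join(
        bit for i, bit in enumerate(sync_bits) if code[i % len(code)] == "_"
    )


CODE = "______0__0111"
print(synchronize("01010101", CODE))  # prints: 0101010010111
print(synchronize("11100011", CODE))  # prints: 1110000110111
```

Because the control bits sit at fixed positions, removing them after decompression recovers the original data exactly.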
  28. Phase, Synchronization and Reliability

    Given this (d,k,n)-synchronization code taken from the alphabet {0, 1, _ }:

    1 1 0 1 _ 1 _ 1 1 0 0 _ 0 _ 0 1 0 0 _ 0 _ 1 1 0 0 _ 1 _

    where
    • d is the number of data bits; here, d=8
    • k is the number of control bits; here, k=20
    • n is the reliability; here, n=7

    We obtain an (8,20,7)-synchronization code
  29. Phase, Synchronization and Reliability

    (8,20,7)-synchronization code: 7-reliable

    Synchronization code:  1 1 0 1 _ 1 _ 1 1 0 0 _ 0 _ 0 1 0 0 _ 0 _ 1 1 0 0 _ 1 _
    Phase:                 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
    Unknown phase in synchronized data:  … 1 1 1 0 0 0 1 1 1 1 0 1 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 0 …
  30. Phase, Synchronization and Reliability

    (8,20,7)-synchronization code: 7-reliable

    Where did we start in the synchronization code? The synchronized data is compared with the code while the number of bits read grows from 1 to 7 (slides 30 to 38; the code, phase and data rows are those of slide 29).
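
The question "where did we start in the synchronization code?" can be sketched as a filter over phases: a phase stays plausible as long as every bit read so far either falls on a data slot (`_`) or equals the control bit at that position. This is a minimal illustration under that reading of the slides; `candidate_phases` is a hypothetical helper, and the three-symbol code `0_1` is a toy example chosen so the result can be checked by hand.

```python
def candidate_phases(code: str, observed: str) -> list[int]:
    """Return every phase at which `observed` could have started.

    `code` is one block over {'0', '1', '_'}; blocks repeat, so positions
    wrap around modulo the block length.
    """
    length = len(code)
    phases = []
    for phase in range(length):
        # A read bit is compatible with a data slot ('_') or an equal control bit.
        if all(code[(phase + i) % length] in ("_", bit)
               for i, bit in enumerate(observed)):
            phases.append(phase)
    return phases


# Without control bits (the 8-bit word of the earlier slide), nothing is learned:
print(candidate_phases("________", "1"))  # prints: [0, 1, 2, 3, 4, 5, 6, 7]

# Toy code "0_1": the candidates shrink as more bits are read.
print(candidate_phases("0_1", "0"))    # prints: [0, 1]
print(candidate_phases("0_1", "01"))   # prints: [0, 1]
print(candidate_phases("0_1", "010"))  # prints: [1]
```

For an n-reliable code, reading n bits is enough to leave a single candidate.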
  39. The considered (d,k,n)-Synchronization Codes
  41. The considered (d,k,n)-Synchronization Codes

    (8,10,9)-sync. code
    The phases range in {0, ..., d+k-1}
  42. The considered (d,k,n)-Synchronization Codes

    (8,10,9)-sync. code
    Its rotations to the left
  45. The considered (d,k,n)-Synchronization Codes

    (8,10,9)-sync. code
    9-reliability ensures that, for any 2 lines, two control bits conflict (same column, different values) within the first 9 columns
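
The reliability condition on the rotation table can be written down directly. The sketch below assumes the reading given on the slides: a code is n-reliable when every pair of distinct left rotations has, somewhere in the first n columns, two control bits in the same column with different values. The helper names are invented for this example, and columns beyond the block length are taken to wrap around, which is an assumption.

```python
from itertools import combinations


def rotations(code: str) -> list[str]:
    """All left rotations of `code`, one per phase 0..len(code)-1."""
    return [code[i:] + code[:i] for i in range(len(code))]


def conflict(row_a: str, row_b: str, n: int) -> bool:
    """True if some column among the first n holds two differing control bits."""
    length = len(row_a)
    for col in range(n):
        a, b = row_a[col % length], row_b[col % length]
        if a != "_" and b != "_" and a != b:
            return True
    return False


def is_reliable(code: str, n: int) -> bool:
    """Check n-reliability: every pair of rotations conflicts within n columns."""
    return all(conflict(a, b, n) for a, b in combinations(rotations(code), 2))


print(is_reliable("0_1", 3))  # prints: True
print(is_reliable("0_1", 2))  # prints: False
```

In the toy code `0_1`, the rotations `0_1` and `_10` only differ on control bits in the third column, so two columns are not enough.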
  48. Part 4: Finding Synchronization Codes: A Constraint Model
  49. Finding Synchronization Codes: A Constraint Model

    Variables
    • The 1st model is built around 2k variables, two for each control bit:
      • its position P, and
      • its value V in a sequence C
  50. Finding Synchronization Codes: A Constraint Model

    • Given 2 control bits in sequence C
    • Let A and B be these 2 control bits
  52. Finding Synchronization Codes: A Constraint Model

    • […] and […] are the positions of bits A and B in seq. C
    • At rotations i and j
  53. Finding Synchronization Codes: A Constraint Model

    • […] and […] are the positions of bits A and B inside lines i and j
  54. Finding Synchronization Codes: A Constraint Model

    • Similarly, we have the values of A and B inside lines i and j
  55. Part 5: Finding Synchronization Codes: A Pseudo-Boolean Model
  56. Finding Synchronization Codes: A Pseudo-Boolean Model

    • 2 binary variables for each of the d+k characters in C:
      • […]: is the i-th bit in C a control bit? If so,
      • […] indicates whether it is a 0 or a 1
  58. Finding Synchronization Codes: A Pseudo-Boolean Model

    • 3rd variable: […]
    • When true (=1), the control bits i and g are 2 distinct bits at the same (rotated) position with different values
  60. Finding Synchronization Codes: A Pseudo-Boolean Model

    False
  61. Finding Synchronization Codes: A Pseudo-Boolean Model

    • Finally, we ensure that the sum of all […] is greater than or equal to 1
  62. Part 6: Finding Synchronization Codes: Experimental results
  63. Finding Synchronization Codes: Experimental results

    • Experiments executed in 2 parts
      • 1st: optimal k for cases where d = n
      • 2nd: smallest k for cases where d ≠ n
  64. Finding Synchronization Codes: Experimental results

    • The combination (PB model, MiniSat+) is faster than (CP model, Gecode)
    • (8,15,8)-sync. code:
      - CP (Gecode): 5 min 48 sec, satisfiable
      - PB (MiniSat+): 0.31 sec, satisfiable
    • (8,14,8)-sync. code:
      - CP (Gecode): 1 month, ???
      - PB (MiniSat+): 3.77 sec, unsatisfiable
  66. Finding Synchronization Codes: Experimental results

    PB results for d = n (figure; legend: Unsatisfiable, Satisfiable)
  68. Finding Synchronization Codes: Experimental results

    PB results for d ≠ n (table; legend)
    • […]: no such code
    • Blank: no answer found within 18000 seconds
    • Integer: smallest value of k
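
To make "smallest value of k" tangible, the search problem can be sketched as a naive brute force for very small d and n. This is not the CP or PB model from the talk, only an exhaustive enumeration under the reliability reading used on the slides (every pair of rotations must show two differing control bits in the same column within the first n columns, columns wrapping around the block). The names `shortest_code` and `k_max` are invented for this example.

```python
from itertools import combinations, product


def is_reliable(code: str, n: int) -> bool:
    """Every pair of left rotations conflicts on control bits within n columns."""
    length = len(code)
    for i, j in combinations(range(length), 2):
        # Column c of rotation r holds code[(r + c) % length].
        if not any(
            code[(i + c) % length] != "_"
            and code[(j + c) % length] != "_"
            and code[(i + c) % length] != code[(j + c) % length]
            for c in range(n)
        ):
            return False
    return True


def shortest_code(d: int, n: int, k_max: int):
    """Return (k, code) for the smallest k <= k_max, or None if none exists."""
    for k in range(k_max + 1):
        length = d + k
        for data_positions in combinations(range(length), d):
            slots = set(data_positions)
            for values in product("01", repeat=k):
                control = iter(values)
                code = "".join(
                    "_" if pos in slots else next(control) for pos in range(length)
                )
                if is_reliable(code, n):
                    return k, code
    return None


print(shortest_code(2, 5, k_max=4))  # prints: (3, '__001')
print(shortest_code(2, 2, k_max=4))  # prints: None
```

The second call illustrates the "no such code" case: with n = 2, a rotation whose first column is a data slot and a rotation whose second column is a data slot can never conflict, whatever k is. The enumeration grows exponentially with d + k, which is why a dedicated solver is needed for the block sizes reported in the talk.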
  70. Conclusion

    • New application of synchronization codes:
      • Compression by Substring Enumeration
    • Characteristics of synchronization codes:
      • d data bits
      • k control bits
      • n-reliable: maximum number of bits read before synchronization
  71. Conclusion

    • 2 models
      • CP model: O(k² + d²) variables and constraints
        - Domains: max(n,k)
      • PB model: O(k² + d²) variables and constraints
        - All domains have only 2 values (binary)
  72. Conclusion

    • Found synchronization codes for words up to 64 bits when d = n
    • Found the minimal number of control bits when d ≠ n