Breaking ALASKA

Breaking ALASKA: Color Separation for Steganalysis in JPEG Domain
Yassine Yousfi, Jan Butora, Jessica Fridrich, and Quentin Giboulot

ALASKA challenges, the way we saw them
- Color JPEGs
- Variable payload
- Multiple stego schemes
- Variable image size
- JPEG QFs 60–100
- Ordering images instead of hard decisions

Commitments and prior knowledge
- SRNet as a leading CNN architecture [Boroumand TIFS’18], adapted to color using a 3-channel first convolutional layer
- Training a “Tile Detector” and using it as a feature extractor to steganalyze arbitrary size images [Fuji Tsang EI’18]
- Multiclass detectors perform the best when dealing with diversified stego sources [Butora EI’19]
- Training detectors for each JPEG quality factor

Datasets
- For each JPEG quality factor:
  - 256×256 “tiles”
  - Arbitrary sized images
  - Base payload and double payload
- TRN / VAL / TST: 42,000 / 3,500 / 3,500
- mixTST: 3,500 images, our “replica” of the final test set

Winning architecture, QF ≤ 98
Figure: architecture diagram. The JPEG file is decompressed without rounding (DCT⁻¹) and split into the channel sets YCrCb, Y, Cr, Cb, and CrCb. Each set feeds a 256×256 Tile Detector (multiclass SRNet). For the Arbitrary Size Detector, the global average, global variance, global minimum, and global maximum (512 each) of the tile detector feature maps are concatenated (512x4x5) and passed through two hidden layers (2x512x4x5 each).

Color separation is the main trick

Table: Color separation boost for QF95 ARBITRARYbase
Architecture            MD5     PE
YCrCb-SRNet             48.13   24.51
Color separated SRNet   38.31   19.25

- Merging colors in the 1st layer using YCrCb-SRNet appears sub-optimal
- Incorrect spread of payload among Y and Cr, Cb in the embedding script may have affected the boost
- Color separation provides an even bigger boost when using the correct payload spread [Taburet IWDW’19]
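A minimal sketch of what color separation means at the input level, assuming the decompressed image is available as an (H, W, 3) array in Y, Cr, Cb order. The function name and dictionary keys are hypothetical; the channel groupings are the ones listed in the architecture diagram (YCrCb, Y, Cr, Cb, CrCb). Each view would feed its own tile detector instead of relying on a single first layer to merge the colors.

```python
import numpy as np

def color_separated_views(ycrcb):
    """Split an (H, W, 3) YCrCb array into channel groups for separate detectors.

    Hypothetical helper: the grouping follows the slide's diagram, the array
    layout (channels last, Y/Cr/Cb order) is an assumption.
    """
    if ycrcb.ndim != 3 or ycrcb.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) array in Y, Cr, Cb order")
    y, cr, cb = ycrcb[..., 0:1], ycrcb[..., 1:2], ycrcb[..., 2:3]
    return {
        "YCrCb": ycrcb,
        "Y": y,
        "Cr": cr,
        "Cb": cb,
        "CrCb": np.concatenate([cr, cb], axis=-1),
    }

# Example on a random 256x256 "tile" (stand-in for a decompressed image).
tile = np.random.default_rng(0).normal(size=(256, 256, 3))
views = color_separated_views(tile)
for name, view in views.items():
    print(name, view.shape)
```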

Multiclass vs binary results confirmed for JPEG domain
- [Butora EI’19] results on diversified stego sources are extended to JPEG domain steganography
- Bigger batch size gives better performance when facing diversified sources (batch size 64)

Table: YCrCb-SRNet trained as binary and multiclass for QF75 on TILEbase
             MD5     PE
Binary       11.41   8.10
Multiclass   9.60    7.13

Arbitrary sized images are steganalyzed using SRNets as feature extractors
- [Fuji Tsang EI’18] results using modified YeNet [Ye TIFS’17] are extended to SRNet for JPEG domain steganography
- 4 “moments”: Mean, Variance, Minimum, and Maximum
- MLP with 2 hidden layers of size (2 x input, 2 x input): non-linear decisions
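The moment pooling and MLP described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the ReLU activations, the two-output head, and the random weights are all hypothetical, and the demo uses 32 feature maps instead of 512 to keep it light. The point it shows is that the four moments give a fixed-length vector regardless of image size.

```python
import numpy as np

def moment_features(feature_maps):
    """Reduce (C, H, W) feature maps to a 4*C vector: mean, variance, min, max."""
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    return np.concatenate([flat.mean(1), flat.var(1), flat.min(1), flat.max(1)])

def init_params(n_in, n_out, rng):
    """Two hidden layers of width 2 * n_in, as on the slide; weights are random."""
    n_hidden = 2 * n_in
    sizes = [(n_in, n_hidden), (n_hidden, n_hidden), (n_hidden, n_out)]
    return [(rng.normal(scale=0.01, size=s), np.zeros(s[1])) for s in sizes]

def mlp_forward(x, params):
    """Forward pass; ReLU is an assumed choice of non-linearity."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h = np.maximum(x @ w1 + b1, 0.0)
    h = np.maximum(h @ w2 + b2, 0.0)
    return h @ w3 + b3

rng = np.random.default_rng(0)
# Feature maps of two images with different spatial sizes.
small = moment_features(rng.normal(size=(32, 16, 16)))
large = moment_features(rng.normal(size=(32, 40, 24)))
params = init_params(small.size, 2, rng)
logits = mlp_forward(large, params)
print(small.shape, large.shape, logits.shape)
```

Because the spatial dimensions are reduced before the MLP, the same trained weights apply to any image size, which is what lets a detector trained on tiles be reused on arbitrary sized images.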

Winning architecture, QF 99: The reverse JPEG compatibility attack
Figure: architecture diagram. The Y channel is decompressed without rounding (DCT⁻¹) and the rounding errors (the decompressed values minus their rounded values) feed a 256×256 Tile Detector (binary SRNet). The global average (512) of its feature maps feeds the Arbitrary Size Detector.
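A minimal sketch of how the rounding errors fed to this detector could be computed, assuming the quantized DCT coefficients of one channel are given as an (H, W) array tiled in 8x8 blocks together with the 8x8 quantization table. The function names are hypothetical, and reading the coefficients from a JPEG file is left out. The +128 level shift and clipping to [0, 255] are ignored: the shift is an integer and does not change the rounding error.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix; rows are frequencies, columns are samples."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def rounding_errors(dct_coeffs, quant_table):
    """Return e = z - round(z), where z is the non-rounded decompressed channel."""
    h, w = dct_coeffs.shape
    if h % 8 or w % 8:
        raise ValueError("dimensions must be multiples of 8")
    d = dct_matrix(8)
    # Rearrange into (block_row, block_col, 8, 8) and dequantize each block.
    blocks = dct_coeffs.reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    dequantized = blocks * quant_table
    # Inverse 2-D DCT of every block: z = D^T C D.
    z = d.T @ dequantized @ d
    z = z.transpose(0, 2, 1, 3).reshape(h, w)
    return z - np.rint(z)

# Example: random integer coefficients with an all-ones table (as for QF100).
rng = np.random.default_rng(0)
coeffs = rng.integers(-20, 21, size=(16, 16)).astype(float)
e = rounding_errors(coeffs, np.ones((8, 8)))
print(e.shape, float(e.min()), float(e.max()))
```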

The reverse JPEG compatibility attack
- Rounding error $e_{ij} = z_{ij} - [z_{ij}]$ follows $\mathcal{N}(0, s)$ “folded” to the interval $[-1/2, 1/2]$:

$$\nu(x; s) = \frac{1}{\sqrt{2\pi s}} \sum_{n \in \mathbb{Z}} \exp\left(-\frac{(x+n)^2}{2s}\right). \tag{1}$$

- For QF100, the variance $s = 1/12$
- Steganographic embedding increases $s$
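The folded Gaussian of Eq. (1) is easy to evaluate numerically. The sketch below truncates the infinite sum to |n| <= n_terms, which is an implementation choice of this example (the terms decay very quickly for the variances of interest); the function name is hypothetical.

```python
import numpy as np

def folded_gaussian(x, s, n_terms=20):
    """Evaluate nu(x; s) from Eq. (1), truncating the sum to |n| <= n_terms."""
    x = np.asarray(x, dtype=float)
    n = np.arange(-n_terms, n_terms + 1)
    shifted = x[..., None] + n  # broadcast x against all integer shifts
    total = np.exp(-shifted**2 / (2.0 * s)).sum(axis=-1)
    return total / np.sqrt(2.0 * np.pi * s)

grid = np.linspace(-0.5, 0.5, 1001)
for s in (1 / 12, 0.1, 0.15, 0.2):
    pdf = folded_gaussian(grid, s)
    # The gap between max and min shrinks as s grows (density flattens).
    print(f"s={s:.4f}  min={pdf.min():.4f}  max={pdf.max():.4f}")
```

The shrinking gap between the maximum and minimum of the density as s grows is the effect the attack exploits: embedding raises s, and the rounding errors move toward a uniform distribution.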

The reverse JPEG compatibility attack
Figure: Folded Gaussian distribution ν(x; s) for noise variance in the DCT domain s = 1/12, 0.1, 0.15, 0.2. Note how rapidly ν(x; s) converges to a uniform distribution with increased s.

The reverse JPEG compatibility attack
- Near perfect detector for QF100 and very good performance for QF99
- Helped us confirm that ALASKArank had only 10% of stego images
- Robust to different JPEG compressors
- Can detect arbitrary steganography and small payloads
- More details in J. Butora, J. Fridrich: “The Reverse JPEG Compatibility Attack”, IEEE TIFS, 2019, under review

Table: Reverse JPEG compatibility attack performance on ARBITRARYbase
        PE
QF100   1.00
QF99    6.00

Ordering ps computed using different detectors: Calibration
- Comparing soft-outputs from different detectors should be done with extreme caution
- Essential property of a probability estimate: being a representative of the true correctness likelihood (calibration)
- Soft-outputs from deep nets, such as SRNet, often lack this property¹
- Shallow networks (MLP) are typically well calibrated
- Arbitrary size SRNet is essentially a shallow network trained on a set of features: well calibrated

¹ Guo, Chuan, et al. "On calibration of modern neural networks." Proceedings of the 34th International Conference on Machine Learning, Volume 70. JMLR.org, 2017.

Ordering ps computed using different detectors: Calibration
Figure: Calibration plot for the tile detector and the arbitrary size detector for JPEG quality 95
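A calibration plot of this kind compares the predicted stego probability with the observed fraction of stego images in each probability bin. The sketch below is a generic binned reliability curve with a weighted calibration error, not the exact procedure used for the figure. The function name, the equal-width bins, and the synthetic data are assumptions of this example.

```python
import numpy as np

def reliability_curve(probs, labels, n_bins=10):
    """Return per-bin mean prediction, per-bin stego fraction, and weighted gap."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index in 0..n_bins-1 for every prediction.
    idx = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    mean_pred, frac_pos, weight = [], [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():  # skip empty bins
            mean_pred.append(probs[mask].mean())
            frac_pos.append(labels[mask].mean())
            weight.append(mask.mean())
    mean_pred, frac_pos, weight = map(np.array, (mean_pred, frac_pos, weight))
    gap = float(np.sum(weight * np.abs(mean_pred - frac_pos)))
    return mean_pred, frac_pos, gap

# Synthetic check: calibrated outputs vs. artificially overconfident ones.
rng = np.random.default_rng(0)
p = rng.uniform(size=20000)
y = (rng.uniform(size=p.size) < p).astype(float)
_, _, gap_calibrated = reliability_curve(p, y)
sharpened = p**3 / (p**3 + (1 - p) ** 3)  # pushes outputs toward 0 and 1
_, _, gap_overconfident = reliability_curve(sharpened, y)
print(gap_calibrated, gap_overconfident)
```

Plotting the per-bin stego fraction against the per-bin mean prediction gives the calibration curve; a well calibrated detector stays close to the diagonal.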

Final results
- Cover source mismatch impact: 400 double JPEG compressed images ([Cogranne IH’19])

Table: Final scores on mixTST and ALASKArank, and the second competitor's scores
                 MD5     PE      FA50
mixTST           18.55   11.50   0.09
ALASKArank       25.2    14.48   0.71
2nd competitor   51.60   25.20   5.86
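The three scores can be computed from detector soft-outputs as sketched below, taking MD5 as the missed detection rate at a 5% false alarm rate, PE as the minimum average of false alarm and missed detection rates over all thresholds, and FA50 as the false alarm rate at 50% missed detection. These definitions are not stated on the slide and are the usual ALASKA conventions as this example understands them. The function name and the "score above threshold means stego" rule are assumptions.

```python
import numpy as np

def alaska_style_metrics(cover_scores, stego_scores):
    """Return (MD5, PE, FA50) in percent; higher score means 'more likely stego'."""
    cover = np.sort(np.asarray(cover_scores, dtype=float))
    stego = np.sort(np.asarray(stego_scores, dtype=float))
    # Candidate thresholds: every observed score, plus -inf (everything is stego).
    thresholds = np.concatenate([[-np.inf], np.unique(np.concatenate([cover, stego]))])
    # An image is declared stego when its score is strictly above the threshold.
    cover_below = np.searchsorted(cover, thresholds, side="right")
    stego_below = np.searchsorted(stego, thresholds, side="right")
    p_fa = (cover.size - cover_below) / cover.size
    p_md = stego_below / stego.size
    pe = np.min((p_fa + p_md) / 2.0)
    md5 = np.min(p_md[p_fa <= 0.05])
    fa50 = np.min(p_fa[p_md <= 0.50])
    return 100.0 * md5, 100.0 * pe, 100.0 * fa50

# Example on synthetic scores (not the competition data).
rng = np.random.default_rng(0)
cover_scores = rng.normal(0.0, 1.0, size=3500)
stego_scores = rng.normal(2.0, 1.0, size=3500)
print(alaska_style_metrics(cover_scores, stego_scores))
```

Under these definitions a detector that cannot separate the two classes scores MD5 = 95 and PE = 50.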

Bag of tricks: Curriculum Learning
- Two types of curriculum learning (CL) have been used in ALASKA:
  - Payload CL
  - JPEG quality factor CL ([Butora EI’19])

Table: Tile detector performance on TILEbase with and without payload curriculum learning for quality factors 75 and 95
        Without CL        With CL
        MD5     PE        MD5     PE
QF75    15.69   10.09     9.60    7.12
QF95    95.00   50.00     13.80   9.29

Bag of tricks, cont’d
- Warm start learning rate
  - Using a small learning rate (10^-4) for a few iterations (around 20,000) before the original learning rate schedule described in [Boroumand TIFS’18]
  - Stabilizes training and helps convergence
- Image augmentation at prediction
  - Augmenting each test image with its rotations and flips and averaging the soft-outputs of the detector over all transformations
  - Improves MD5 by 1-2%
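Augmentation at prediction time can be sketched as below. The slide only says "rotations and flips"; using the 8 combinations of 90-degree rotations and a horizontal flip is an assumption of this example, as are the function names and the toy detector. Any callable that maps an (H, W, C) array to a vector of soft-outputs can be plugged in.

```python
import numpy as np

def rotations_and_flips(img):
    """Yield the 8 combinations of 90-degree rotations and a horizontal flip."""
    for k in range(4):
        rotated = np.rot90(img, k, axes=(0, 1))
        yield rotated
        yield np.flip(rotated, axis=1)

def predict_with_augmentation(detector, img):
    """Average the detector soft-outputs over all transformed copies of img."""
    outputs = [
        np.asarray(detector(np.ascontiguousarray(view)), dtype=float)
        for view in rotations_and_flips(img)
    ]
    return np.mean(outputs, axis=0)

def toy_detector(x):
    """Stand-in for a trained network: depends on the top-left pixel only."""
    return np.array([x[0, 0, 0]])

img = np.arange(16, dtype=float).reshape(4, 4, 1)
print(predict_with_augmentation(toy_detector, img))
```

The toy detector is deliberately orientation-dependent, so the averaged output differs from the single-view output; a real detector would show smaller but similar variation across views.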

Low false alarm performance of SRNet

Looking back
- ALASKA was very challenging and gave birth to interesting research and discoveries
- “Wilder” than BOSS but still far from the real world
- Unrealistically noisy images (due to excessive sharpening and micro contrast ...)
- → ALASKA v2?

Future research and open questions
- Is there a better way to perform color steganalysis (or more generally, multi-channel steganalysis)?
- How to make our approach more scalable to:
  - Unseen stego schemes
  - Unseen cover processing, double JPEG compression, custom JPEG quantization, etc.?
