LPixelLT20190419R.pdf

Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology
https://arxiv.org/abs/1902.06543 (Submitted on 18 Feb 2019)
David Tellez et al., Diagnostic Image Analysis Group and the Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
LPixel Inc. Presents Image Analysis x Machine Learning #1, 19th Apr. 2019 @LPixel Inc.
Presenter: Tu-chan

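Introduction
§ Pathologists typically make a diagnosis by looking through a microscope and examining cell morphology and tissue architecture
§ Recently, advances in slide scanners, which digitize an entire glass slide, have driven diagnosis and research on digital images (computational pathology)
§ However, it is well known that image color varies with many factors in the image production process, and there is concern that this affects pathologists' judgment and image analysis results
§ It is also frequently reported that a model trained only on images from a single institution does not reach the expected accuracy when applied to images from other institutions
[Figure labels: Digitizing images / Does it still work on images with a different color appearance?]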

Hematoxylin and eosin staining
Hematoxylin and eosin (HE) staining: nuclei are stained dark purple (hematoxylin), cytoplasm is stained pink (eosin)
Whole-slide image (WSI)
Datasets used in this paper

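How an HE slide is made (about 2 days in total)
[Figure labels, listed in workflow order:]
- Biopsy
- Formalin fixation (formalin fixative)
- Cutting the specimen into pieces (the cut face is the surface observed under the microscope)
- Dehydration, washing, heating
- Paraffin-embedded block
- Sectioning at a thickness of 2 to 4 micrometers
- Mounting on a glass slide (slide before staining)
- Hematoxylin and eosin (HE) staining (slide after HE staining)
- Observation under the microscope
- Digitization with a slide scanner
- Digitized HE-stained image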

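Factors affecting the color of the stain
- Skill of the person cutting the sections
- Type of specimen (organ)
- Differences between institutions
- Monitor type and color settings
- Differences between scanner vendors
- Temperature of the room (staining solution)
- Room temperature, reaction time
All of these feed into the HE stain that is finally produced.
From p. 15 of https://slideplayer.com/slide/11404136/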

Methods for improving the generalization of HE image classification
1. Stain color augmentation (data augmentation)
  1) Morphological transformation
     Basic: rotation, mirroring, scaling
     Advanced: elastic deformation, Gaussian noise/blurring
  2) Color transformation
     Common: brightness, contrast, hue
     More tailored: HE color deconvolution
[Figure from IEEE Transactions on Medical Imaging, DOI 10.1109/TMI.2018.2820199. Fig. 5: Multiple augmented versions of the same mitotic patch (Stain Transform, Elastic Deformation, Scaling, Image Enhancement, Blurring, Gaussian Noise, Combination). Each column shows samples of a single augmentation function except for the last one, which combines all the techniques together with rotation.]
[Figure, same source. Fig. 6: H&E stain augmentation. From left to right: first, an RGB patch is decomposed into hematoxylin (Hch), eosin (Ech) and residual (Rch) color channels. Then, each channel is individually modified with a random factor and bias (α1∙Hch + β1, α2∙Ech + β2, α3∙Rch + β3). Finally, resulting channels are transformed back to RGB color space.]
2. Stain color normalization
  1) color deconvolution matrix
  2) detect certain morphological structures
  3) generative models
  4) context-based normalization
[Figure: SVD-geodesic method for obtaining stain vectors; the algorithm text is cropped on the slide.]
[Figure. Fig. 4: This shows the calculated stain vectors for twelve hematoxylin and eosin stained test slides. The color of each symbol corresponds to what would be produced by that vector. The stars are the standard vectors used without regard to the specific slide. The circles are the automatically computed stain vectors. Notice that all recovered vectors are significantly different from the standard vectors.]
Macenko et al. 2009; Tellez et al. 2018; Bejnordi et al. 2015
“This lack of improvement likely stems from our extensive color perturbations encouraging our models to learn color-insensitive features, and thus the color normalization was unnecessary.” (Google Brain 2017, https://arxiv.org/pdf/1703.02442.pdf)
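The stain augmentation described in the Fig. 6 caption above (decompose RGB into stain channels, modify each channel with a random factor and bias, transform back to RGB) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `hed_augment`, the single `sigma` parameter, and the way the factor and bias are sampled are assumptions, and the third (DAB) row of the stain matrix stands in for the residual channel. The matrix values are the Ruifrok and Johnston stain vectors, as also used by scikit-image.

```python
import numpy as np

# Stain matrix (rows: hematoxylin, eosin, DAB; columns: R, G, B),
# Ruifrok & Johnston values as used by scikit-image's rgb_from_hed.
RGB_FROM_HED = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11],
                         [0.27, 0.57, 0.78]])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def hed_augment(rgb, sigma=0.05, rng=None):
    """Perturb each stain channel c as alpha * c + beta (hypothetical helper).

    rgb:   float array of shape (H, W, 3) with values in [0, 1].
    sigma: assumed perturbation strength; alpha ~ U(1 - sigma, 1 + sigma)
           and beta ~ U(-sigma, sigma), drawn once per stain channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    # RGB -> optical density -> stain concentrations.
    od = -np.log(np.clip(rgb, 1e-6, 1.0))
    hed = od @ HED_FROM_RGB
    # One random factor and one random bias per stain channel.
    alpha = rng.uniform(1.0 - sigma, 1.0 + sigma, size=3)
    beta = rng.uniform(-sigma, sigma, size=3)
    hed = hed * alpha + beta
    # Stain concentrations -> optical density -> RGB.
    rgb_aug = np.exp(-(hed @ RGB_FROM_HED))
    return np.clip(rgb_aug, 0.0, 1.0)
```

With `sigma=0` the function reduces to a deconvolution round trip and returns the input patch, which is a convenient sanity check before increasing the strength.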

Stain color augmentation
1. Basic: 90° rotations, vertical/horizontal mirroring
2. Morphology: Basic + scaling, elastic deformation, additive Gaussian noise, Gaussian blurring
3. Brightness & contrast (BC): Basic + random BC perturbations
4. Hue-Saturation-Value (HSV): Basic + random HSV shifting (HSV-light: [-0.1, 0.1]; HSV-strong: [-1, 1])
5. Hematoxylin-Eosin-DAB (HED): Basic + random HE shifting (HED-light: [-0.05, 0.05]; HED-strong: [-0.2, 0.2])
§ WSIs from the authors' own institution form the training data set: 14 to 50 WSIs, about 1M patches
§ Data from other institutions or from challenges form the test set
§ Combinations of color augmentation and normalization methods are compared to find which gives the best AUC on the test set (weighted average AUC for multi-class tasks)
§ Global ranking over all 4 organs: add the mean of the per-organ rankings to the worst ranking across all organs, then divide by 2
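The global ranking rule in the last bullet is simple arithmetic, sketched below. The function name, the organ names, and the rank values are invented for illustration. The sketch also assumes that rank 1 is best, so the worst rank is the largest number.

```python
def global_ranking_score(organ_ranks):
    """Average of the mean per-organ rank and the worst per-organ rank.

    organ_ranks: dict mapping organ name -> rank of one method on that organ.
    Assumption: rank 1 is best, so the worst rank is the maximum.
    """
    ranks = list(organ_ranks.values())
    mean_rank = sum(ranks) / len(ranks)
    worst_rank = max(ranks)
    return (mean_rank + worst_rank) / 2


# Hypothetical ranks for two methods; names and values are made up.
method_a = {"organ_1": 1, "organ_2": 2, "organ_3": 3, "organ_4": 10}
method_b = {"organ_1": 4, "organ_2": 4, "organ_3": 4, "organ_4": 4}

score_a = global_ranking_score(method_a)  # (4.0 + 10) / 2
score_b = global_ranking_score(method_b)  # (4.0 + 4) / 2
```

Both hypothetical methods have the same mean rank, but the second one gets the better (lower) global score because the rule penalizes a poor result on any single organ.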

[Paper excerpt shown on the slide. The left edge is cropped; lost text is marked with […].]
[…] variation strength, called HED-light and […] we selected the value of the augmen[…] parameters randomly within certain ranges […] variation. We tuned all ranges manually […]. In particular, we used a scaling […] [0.8, 1.2], elastic deformation parameters […] and […] [9.0, 11.0], additive Gaussian noise […] 0.1], Gaussian blurring with […] [0, 0.1], […] intensity ratio between [0.65, 1.35], and con[…] ratio between [0.5, 1.5]. For HSV-light […] we used hue and saturation intensity ra[…] [-0.1, 0.1] and [-1, 1], respectively. For […] HED-strong, we used intensity ratios between [-0.05, 0.05] and [-0.2, 0.2], respectively, for all […].
Figure 4: Network-based stain color normalization. From left to right: patches from the training set are transformed with heavy color augmentation and fed to a neural network. This network is trained to reconstruct the original appearance of the input images by removing color augmentation, effectively learning how to perform stain color normalization.
[…]alize to unseen stains in order to perform well. We evaluated several methods that implement g (see Fig. 3), and propose a novel technique based on neural networks.
Identity. We performed no transformation on the input patches, serving as a baseline method for the rest of techniques.

Stain color normalization
1. Identity: no transformation
2. Grayscale: RGB to grayscale, which removes color information (augmentation limited to basic, morphology, and BC)
3. LUT-based: detect nuclei and standardize the colors; a look-up table (LUT) is built from a template WSI
4. Network-based (figure at right)
   Downward: 5 layers, BN, LRA
   Upward: 5 layers, nearest-neighbor upsampling + 1 conv, BN, LRA + tanh, 64-sample mini-batch
   Trained on 500K patches collected from WSIs of the 4 organs
   HSV augmentation (color transformation only), HSV value channel ratios between [-1, 1]
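The training setup in the Figure 4 caption (augment the color heavily, then train a network to reconstruct the original patch) depends on generating input/target pairs. The sketch below covers only that pair generation, using the standard-library `colorsys` module. It is not the authors' code: the helper names are hypothetical, and treating the shifts as additive (hue wrapping around, saturation and value clipped to [0, 1]) is an assumption, since the slide only gives the range [-1, 1].

```python
import colorsys
import random


def hsv_shift_pixel(rgb, dh, ds, dv):
    """Shift one RGB pixel (floats in [0, 1]) in HSV space.

    Assumption: shifts are additive; hue wraps around, while
    saturation and value are clipped to [0, 1].
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + dh) % 1.0
    s = min(max(s + ds, 0.0), 1.0)
    v = min(max(v + dv, 0.0), 1.0)
    return colorsys.hsv_to_rgb(h, s, v)


def make_training_pair(patch, strength=1.0, rng=None):
    """Return (augmented_input, target) for a reconstruction-style normalizer.

    patch: list of rows, each a list of (r, g, b) tuples in [0, 1].
    One random shift per HSV channel is drawn for the whole patch, so the
    augmentation changes the stain appearance but not the morphology.
    """
    if rng is None:
        rng = random.Random()
    dh, ds, dv = (rng.uniform(-strength, strength) for _ in range(3))
    augmented = [[hsv_shift_pixel(px, dh, ds, dv) for px in row] for row in patch]
    # The target is the untouched original patch.
    return augmented, patch
```

A network trained on such pairs sees many color variants of the same tissue and is asked to map all of them back to one appearance, which is the behavior the caption calls stain color normalization.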

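Results and discussion
§ Effect of stain color augmentation
  - Augmentation is essential regardless of the type of color normalization
  - The color-transforming HSV or HED augmentations are effective (no difference was observed between HSV and HED, or between light and strong)
§ Effect of stain color normalization
  - Pairs that use network-based normalization score best
  - LUT-based normalization is equivalent to identity
  - Color normalization alone is insufficient
  - Grayscale performs poorly, so color information is indeed useful
However,
§ Even when color normalization works well, color information can still cause overfitting (it becomes noise)
§ Network-based normalization yields only a slight AUC increase, and it is questionable whether this is worth the extra computation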

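Summary
§ Pathological diagnosis is fundamentally based on morphology in HE-stained tissue, but staining variation caused by differences in environment and equipment between institutions can affect pathologists' judgment and image analysis results
§ Combining color augmentation and normalization is one solution, but in CNN-based pathology image classification it was augmentation that dramatically improved generalization (HSV/HED transformation is essential)
§ Color normalization unifies the visual appearance but is insufficient on its own; without appropriate augmentation, color becomes noise or leads to overfitting
§ Network-based normalization is computationally expensive for only a slight gain in accuracy and can be omitted (in that case, use stronger color transformation)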
