NTIRE 2021 Learning the Super-Resolution Space Challenge

Sansan, Inc., Technology Division, DSOC R&D Automation Group, 内田奏
第七回全日本コンピュータビジョン勉強会 (7th All-Japan Computer Vision Study Group), Part 2, 2021/07/31

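Data Strategy and Operation Center
Agenda
1. What is super-resolution?
2. Overview and history of NTIRE
3. NTIRE 2021 Learning the Super-Resolution Space Challenge
   1. Problem setting
   2. Related work
4. Results
5. Impressions
* Figures and tables are taken from the cited papers and presentation materials.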

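What is super-resolution?
A technique that raises the resolution of an input signal before outputting it, i.e. resolution enhancement.
• Besides images, it also appears in audio, radio-wave, and sensing applications
• The term sometimes refers specifically to recovering high-frequency components
[Figure: low-resolution image (LR) → super-resolved image (SR)]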

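Problem setting
Super-resolution is a form of image restoration.
• The low-resolution (LR) image is assumed to be produced by degrading a high-resolution (HR) image
• The goal is to find the inverse transform $\mathcal{F}$ of the degradation $\mathcal{D}$
[Figure: HR image $I_{HR}$ → degradation → LR image $I_{LR} = \mathcal{D}(I_{HR})$ → restoration → SR image $I_{SR} = \mathcal{F}(I_{LR})$]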
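
The degradation/restoration pair can be sketched in a few lines of NumPy. This is only an illustration: the block-average `degrade` and the nearest-neighbour `restore_nearest` are stand-ins chosen here for simplicity, not the degradation or model used in any NTIRE challenge.

```python
import numpy as np

def degrade(hr, scale):
    """Toy degradation D: average each scale x scale block of the HR image."""
    h, w = hr.shape
    assert h % scale == 0 and w % scale == 0
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def restore_nearest(lr, scale):
    """Naive restoration F: repeat every LR pixel scale x scale times."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

rng = np.random.default_rng(0)
hr = rng.random((8, 8))        # I_HR (toy grayscale image)
lr = degrade(hr, 4)            # I_LR = D(I_HR)
sr = restore_nearest(lr, 4)    # I_SR = F(I_LR)

# The SR image is consistent with the LR image ...
print(np.allclose(degrade(sr, 4), lr))
# ... but the detail removed by D is not recovered.
print(np.allclose(sr, hr))
```

The naive F reproduces the LR image exactly after degradation, yet it cannot recover the lost detail, which is why learned models are used for F.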

What is NTIRE?
A workshop on image restoration and enhancement, co-located with CVPR.
• Led by the Computer Vision Lab at ETH Zurich
• Competitions on related tasks are held alongside the workshop

This talk zooms in on the super-resolution track!

History of the NTIRE Challenge (1): 2017 and 2018
• Entrants competed on PSNR/SSIM using the DIV2K dataset [Agustsson+ CVPRW2017]
• The main theme was exploring network architectures and making them deeper
> e.g. EDSR [Lim+ CVPRW2017], DBPN [Harris+ CVPR2018]
• The Perception-Distortion Tradeoff [Blau+ CVPR2018] was proposed, and perceptual quality came to be emphasized
[Figures: EDSR architecture; DBPN architecture; Perception-Distortion Tradeoff]

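History of the NTIRE Challenge (2): 2019
• The push toward real-world applications grew stronger
> See: 【Intern CV Report】CVPR2019における超解像 (Super-resolution at CVPR 2019) – Sansan Builders Blog
• A competition with unknown scale factors was held on the RealSR dataset [Cai+ CVPR2019]
• Entries refined the training procedure, for example by solving at multiple scales with U-shaped networks
> e.g. U-Net+MixUp [Feng+ CVPRW2019], which later developed into CutBlur [Yoo+ CVPR2020]
[Figure: example of a U-shaped network]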

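History of the NTIRE Challenge (3): 2020
• Real-World Super-Resolution: a problem setting in which no ground truth is available
> "I want to enlarge a photo taken with my iPhone!" → no corresponding high-resolution image exists
> A set of noisy LR images and a separate set of clean HR images were provided
> The winning method used kernel estimation and noise injection [Ji+ CVPRW2020]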

Results of the Real-World Super-Resolution Challenge
* For details, see 【Zoom or Die】第1回 NTIRE2020 Challenge 結果速報 (NTIRE 2020 Challenge results flash report, part 1) - Sansan Builders Blog
[Figure: results; 👑 marks the winner]

Now for the main topic

GIF animation from https://github.com/andreas128/NTIRE21_Learning_SR_Space

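Learning the Super-Resolution Space Challenge
A competition on learning the space of SR images that can be output for a given LR image.
• Aims at formulations of learning that better account for the ill-posed nature of the problem
• Evaluates the SR Space with several metrics to establish how they relate to each other and to set baselines
• Also hopes to encourage exploration of a controllable SR Space and correction of results
[Figure: building the inverse of the many-to-one downscaling]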
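
To make the "many-to-one" point concrete, the toy sketch below draws ten different HR candidates that all downscale to exactly the same LR image. Everything here is an illustrative assumption: `downscale` is a block average rather than the challenge's downscaling, and `sample_sr` is a hand-made sampler, not a learned model.

```python
import numpy as np

def downscale(hr, scale):
    """Toy many-to-one downscaling: average each scale x scale block."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def upscale_nearest(lr, scale):
    """Repeat every pixel scale x scale times."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def sample_sr(lr, scale, rng, strength=0.1):
    """Draw one member of a toy SR space for the given LR image."""
    base = upscale_nearest(lr, scale)
    detail = rng.normal(0.0, strength, size=base.shape)
    # Subtract each block's mean so the detail vanishes under downscaling.
    detail -= upscale_nearest(downscale(detail, scale), scale)
    return base + detail

rng = np.random.default_rng(0)
lr = rng.random((4, 4))
samples = [sample_sr(lr, 4, rng) for _ in range(10)]

# Every sample is LR-consistent, yet the samples differ from each other.
print(all(np.allclose(downscale(s, 4), lr) for s in samples))
print(np.allclose(samples[0], samples[1]))
```

A learned SR Space would additionally restrict these candidates to photo-realistic ones; random detail like this only shows that the inverse is not unique.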

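Regulations
Submission
• Submit 10 SR images for each LR image (x4, x8)
Rules
• It must be possible to draw an arbitrary number of samples from the model
> Models with an upper bound on the number of samples, e.g. models with several final layers, are prohibited
• The entry must be a single model
• No self-ensemble or test-time augmentation
• All samples must be generated with the same hyperparameters
• Any pre-training is allowed, except on the DIV2K data split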

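Evaluation
Photo-realism
• A user study is used to compute the Mean Opinion Rank (MOR)
> Each participant's submission is ranked, and the ranks are averaged
The spanning of the SR Space
• The aim is to evaluate whether the samples have semantic diversity (≠ pixel-level variation)
> Diversity is not necessarily something to maximize
• Diversity is evaluated using LPIPS [Zhang+ CVPR2018] against the ground truth (the equation is shown on the slide)
Low Resolution Consistency
• The SR image is downscaled bicubically and evaluated by PSNR against the LR image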
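
The three criteria above can be sketched as follows. Treat this as a hedged illustration rather than the official scoring code: the `diversity_score` formula, (global best − local best) / global best × 100, is my reading of the challenge description and should be checked against the organizers' repository. It takes generic per-pixel distance maps instead of real LPIPS maps, and the bicubic downscaling needed for LR consistency is left to an image library and not implemented here.

```python
import numpy as np

def psnr(a, b, max_val=1.0):
    """PSNR in dB; for LR consistency, a = bicubic-downscaled SR, b = LR."""
    mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))

def mean_opinion_rank(rankings):
    """rankings[v][m] = rank voter v gave to method m (1 = best)."""
    return np.asarray(rankings, dtype=float).mean(axis=0)

def diversity_score(distance_maps):
    """distance_maps: (M, H, W) per-pixel distances of M samples to the GT.

    ASSUMPTION: score = (global best - local best) / global best * 100.
    """
    d = np.asarray(distance_maps, dtype=float)
    global_best = d.mean(axis=(1, 2)).min()  # best single sample on average
    local_best = d.min(axis=0).mean()        # best sample picked per pixel
    return (global_best - local_best) / global_best * 100.0

# Three voters rank three methods; lower MOR is better.
mor = mean_opinion_rank([[1, 2, 3], [2, 1, 3], [1, 3, 2]])

# Two samples that are each right where the other is wrong: maximal spread.
complementary = diversity_score([[[0.0, 1.0]], [[1.0, 0.0]]])
# Two identical samples: no spread at all.
identical = diversity_score([[[0.5, 0.5]], [[0.5, 0.5]]])

lr_psnr = psnr(np.zeros((2, 2)), np.full((2, 2), 0.1))
```

Under this formula the score is zero when all samples are identical and grows when different samples are closest to the ground truth in different regions, which matches the slide's point that the metric targets meaningful spread rather than raw pixel variance.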

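Related work: SRFlow [Lugmayr+ ECCV2020]
A flow-based super-resolution method.
• Learns latent variables using an invertible network architecture
> A single network can both encode and decode
> The log-likelihood can be optimized directly, i.e. no reparameterization trick or similar is needed
• Learns the SR Space with a flow conditioned on the LR image
[Figures: comparison with other methods; VAE vs. Flow [Weng 2018]]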
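
The properties listed above (one network for encoding and decoding, exact log-likelihood, conditioning on the LR input) can be seen in a single conditional affine coupling layer. The sketch below is a toy on flat vectors with random linear maps `w_scale` and `w_shift`, all of which are hypothetical names chosen for illustration; it is not SRFlow's actual architecture.

```python
import numpy as np

def coupling_forward(x, cond, w_scale, w_shift):
    """Encode x -> z. The first half passes through; the second half is
    scaled and shifted by functions of the first half and the condition."""
    x1, x2 = np.split(x, 2)
    h = np.concatenate([x1, cond])
    log_s = np.tanh(w_scale @ h)          # bounded log-scale
    t = w_shift @ h
    z2 = x2 * np.exp(log_s) + t
    return np.concatenate([x1, z2]), log_s.sum()  # z and log|det Jacobian|

def coupling_inverse(z, cond, w_scale, w_shift):
    """Decode z -> x with the same parameters (exact inverse)."""
    z1, z2 = np.split(z, 2)
    h = np.concatenate([z1, cond])
    log_s = np.tanh(w_scale @ h)
    t = w_shift @ h
    return np.concatenate([z1, (z2 - t) * np.exp(-log_s)])

def nll(x, cond, w_scale, w_shift):
    """Exact negative log-likelihood under a standard normal prior on z."""
    z, log_det = coupling_forward(x, cond, w_scale, w_shift)
    log_pz = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2.0 * np.pi)
    return -(log_pz + log_det)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                    # toy "HR" vector
cond = rng.normal(size=3)                 # toy "LR" conditioning features
w_scale = rng.normal(size=(2, 5)) * 0.1
w_shift = rng.normal(size=(2, 5)) * 0.1

z, log_det = coupling_forward(x, cond, w_scale, w_shift)
x_rec = coupling_inverse(z, cond, w_scale, w_shift)

# Sampling: draw z from the prior and decode it under the same condition.
z_new = rng.normal(size=4)
sample = coupling_inverse(z_new, cond, w_scale, w_shift)
```

Because the Jacobian of the coupling is triangular, its log-determinant is just the sum of the log-scales, which is what makes the NLL cheap to evaluate and train on directly.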

SRFlow architecture [figure]

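Related work: IRN [Xiao+ ECCV2020]
Models upscaling ⇄ downscaling with an invertible network.
• Uses the Haar transformation to split the variables
> High-frequency and low-frequency components are separated explicitly and then coupled
• Training itself uses a reconstruction loss, a perceptual loss, and a JS divergence
> SRFlow, in contrast, is trained with the NLL loss only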
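
The Haar split mentioned above can be written out for a single grayscale image. This is a minimal sketch using an averaging normalization chosen here for readability (so the low-frequency band is simply the 2x2 block mean); IRN's own implementation and scaling may differ.

```python
import numpy as np

def haar_forward(x):
    """Split an (H, W) image into one low- and three high-frequency bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    low = (a + b + c + d) / 4.0           # block average (a 2x downscale)
    h1 = (a - b + c - d) / 4.0            # left minus right
    h2 = (a + b - c - d) / 4.0            # top minus bottom
    h3 = (a - b - c + d) / 4.0            # diagonal difference
    return low, np.stack([h1, h2, h3])

def haar_inverse(low, high):
    """Exactly rebuild the image from the four bands."""
    h1, h2, h3 = high
    x = np.empty((low.shape[0] * 2, low.shape[1] * 2))
    x[0::2, 0::2] = low + h1 + h2 + h3
    x[0::2, 1::2] = low - h1 + h2 - h3
    x[1::2, 0::2] = low + h1 - h2 - h3
    x[1::2, 1::2] = low - h1 - h2 + h3
    return x

rng = np.random.default_rng(0)
img = rng.random((4, 4))
low, high = haar_forward(img)
rebuilt = haar_inverse(low, high)
print(low.shape, high.shape, np.allclose(rebuilt, img))
```

The low band already looks like a downscaled image, so an invertible network only has to model what happens to the three high-frequency bands, which is the part lost when downscaling.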

Results

Results (lower is better)

Results (lower is better): flow-based methods are the strongest contenders

Qualitative comparison
[Figure panels: 👑 winner; deterministic; GAN-based; flow-based]

Comparison between samples: diversity in texture can be observed

Change in diversity with the number of samples (x4, x8)
Increasing the number of samples improves the diversity metric (it looks likely to converge at some point)

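Winner solution [Kim+ CVPRW2021]
Inserts a Noise Condition Layer into the Conditional Flow Step of SRFlow.
• During training, noise is added to the input image and a resized noise map is fed into the layer
• This contributes to improved diversity
[Figures: result of training with noise added to the input → handled by the Noise Condition Layer; network architecture]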
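
The slide only says that noise is added to the input and that a resized noise map is fed to the layer, so the sketch below is one possible reading and not the winners' implementation. The function names, the nearest-neighbour resize, and the choice to concatenate the map as an extra channel are all assumptions made for illustration.

```python
import numpy as np

def make_noisy_input(lr, sigma, rng):
    """Add Gaussian noise to the LR input and keep the noise map."""
    noise = rng.normal(0.0, sigma, size=lr.shape)
    return lr + noise, noise

def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) map to (out_h, out_w)."""
    rows = np.arange(out_h) * x.shape[0] // out_h
    cols = np.arange(out_w) * x.shape[1] // out_w
    return x[rows][:, cols]

def condition_on_noise(features, noise_map):
    """ASSUMED conditioning: append the resized noise map as one channel.

    features: (C, H, W) conditioning features at some flow level.
    """
    _, h, w = features.shape
    resized = resize_nearest(noise_map, h, w)
    return np.concatenate([features, resized[None]], axis=0)

rng = np.random.default_rng(0)
lr = rng.random((8, 8))
noisy_lr, noise = make_noisy_input(lr, sigma=0.05, rng=rng)
features = np.zeros((2, 4, 4))            # placeholder encoder features
conditioned = condition_on_noise(features, noise)
print(conditioned.shape)
```

The idea conveyed by the slide is that the flow step is told which noise was injected, so it can account for it instead of being degraded by it.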

njtech&seu (x4 2nd, x8 6th)
Introduces a Transformer into the Low Resolution Encoder.
• Possibly inspired by the Image Processing Transformer (IPT) [Chen+ CVPR2021]?

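Impressions
Flow-based methods are strong
• The field has been almost entirely application-driven, but this may be an opening for going deeper on the theory
• SRFlow is so strong that the participants seem to be playing in the palm of its hand
Where do Super-Resolution / Up-sampling end?
• The name gives the impression that the objects in the image are being enlarged "faithfully"
> At high scale factors, dataset bias has a strong influence, e.g. PULSE [Menon+ CVPR2020]
> "Conditional Image Generation under LR-constraints" might be a less misleading name
• Studying how broad the SR Space should be is important
> e.g. constraining the semantic changes that can occur within the learned space
> The scale factors and diversity acceptable for each task need to be discussed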

References
[Weng 2018] L. Weng, “Flow-based deep generative models,” Oct. 13, 2018. https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html (accessed Jul. 31, 2021).
[Zhang+ CVPR2018] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 586–595.
[Ji+ CVPRW2020] X. Ji, Y. Cao, Y. Tai, C. Wang, J. Li, and F. Huang, “Real-world super-resolution via kernel estimation and noise injection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 466–467.
[Yoo+ CVPR2020] J. Yoo, N. Ahn, and K.-A. Sohn, “Rethinking data augmentation for image super-resolution: A comprehensive analysis and a new strategy,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8375–8384.
[Feng+ CVPRW2019] R. Feng, J. Gu, Y. Qiao, and C. Dong, “Suppressing model overfitting for image super-resolution networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[Blau+ CVPR2018] Y. Blau and T. Michaeli, “The perception-distortion tradeoff,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6228–6237.
[Harris+ CVPR2018] M. Haris, G. Shakhnarovich, and N. Ukita, “Deep back-projection networks for super-resolution,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1664–1673.
[Lim+ CVPRW2017] B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, “Enhanced deep residual networks for single image super-resolution,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 136–144.
[Agustsson+ CVPRW2017] E. Agustsson and R. Timofte, “NTIRE 2017 challenge on single image super-resolution: Dataset and study,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 126–135.
[Lugmayr+ CVPRW2021] A. Lugmayr, M. Danelljan, and R. Timofte, “NTIRE 2021 learning the super-resolution space challenge,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021, pp. 596–612.
[Kim+ CVPRW2021] Y. Kim and D. Son, “Noise Conditional Flow Model for Learning the Super-Resolution Space,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021, pp. 424–432.
[Lugmayr+ ECCV2020] A. Lugmayr, M. Danelljan, L. Van Gool, and R. Timofte, “SRFlow: Learning the Super-Resolution Space with Normalizing Flow,” in Computer Vision – ECCV 2020, 2020, pp. 715–732.
[Chen+ CVPR2021] H. Chen et al., “Pre-trained image processing transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 12299–12310.
[Xiao+ ECCV2020] M. Xiao et al., “Invertible Image Rescaling,” in Computer Vision – ECCV 2020, 2020, pp. 126–144.
[Menon+ CVPR2020] S. Menon, A. Damian, S. Hu, N. Ravi, and C. Rudin, “PULSE: Self-supervised photo upsampling via latent space exploration of generative models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2437–2445.
