Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Daichi Horita, Naoto Inoue, Kotaro Kikuchi, Kota Yamaguchi, Kiyoharu Aizawa
CVPR, 2024.
Project Website: https://udonda.github.io/RALF/
GitHub Repository: https://github.com/CyberAgentAILab/RALF
Paper: https://arxiv.org/abs/2311.13602
Abstract
Content-aware graphic layout generation aims to automatically arrange visual elements along with a given content, such as an e-commerce product image. In this paper, we argue that the current layout generation approaches suffer from the limited training data for the high-dimensional layout structure. We show that a simple retrieval augmentation can significantly improve the generation quality. Our model, which is named Retrieval-Augmented Layout Transformer (RALF), retrieves nearest neighbor layout examples based on an input image and feeds these results into an autoregressive generator. Our model can apply retrieval augmentation to various controllable generation tasks and yield high-quality layouts within a unified architecture. Our extensive experiments show that RALF successfully generates content-aware layouts in both constrained and unconstrained settings and significantly outperforms the baselines.
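The retrieval step described in the abstract (finding nearest-neighbor layout examples for an input image) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which image features or similarity measure RALF uses, so the cosine similarity, the three-dimensional toy features, the `(label, x, y, w, h)` layout tuples, and the names `cosine_similarity`, `retrieve_layouts`, and `DATABASE` are all assumptions made for illustration. The generator that would consume the retrieved layouts is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_layouts(query_feature, database, k=2):
    """Return the layouts of the k entries whose image features best match the query.

    `database` is a hypothetical list of (image_feature, layout) pairs.
    """
    ranked = sorted(
        database,
        key=lambda entry: cosine_similarity(query_feature, entry[0]),
        reverse=True,
    )
    return [layout for _, layout in ranked[:k]]


# Toy database: each layout is a list of (label, x, y, w, h) elements (illustrative only).
DATABASE = [
    ([1.0, 0.0, 0.0], [("logo", 0.1, 0.1, 0.3, 0.1)]),
    ([0.9, 0.1, 0.0], [("text", 0.1, 0.7, 0.8, 0.2)]),
    ([0.0, 1.0, 0.0], [("underlay", 0.0, 0.5, 1.0, 0.5)]),
]

# Feature of the input image (hypothetical values).
retrieved = retrieve_layouts([1.0, 0.05, 0.0], DATABASE, k=2)
print(retrieved)
```

In a retrieval-augmented setup of this kind, the retrieved layouts would be passed to the generator as additional conditioning alongside the input image. A brute-force sort is enough for a toy database, while a larger one would typically call for an approximate nearest-neighbor index.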
# For an explanation of the paper, please refer to the slides below.
https://speakerdeck.com/udonda/cvpr24-oral-retrieval-augmented-layout-transformer-for-content-aware-layout-generation
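# Event Overview
We invite authors of papers accepted at top conferences and top journals to give invited talks, sharing with a broad audience not only the content of their work but also the process of shaping research and papers that can withstand fierce competition and rigorous peer review. We hope this will encourage many students and researchers to take on the challenge of submitting to top conferences and similar venues.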