MIRU2024 Invited Talk: RALF in CVPR2024

Udon
August 05, 2024

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Daichi Horita, Naoto Inoue, Kotaro Kikuchi, Kota Yamaguchi, Kiyoharu Aizawa
CVPR, 2024.

Project Website: https://udonda.github.io/RALF/
GitHub Repository: https://github.com/CyberAgentAILab/RALF
Paper: https://arxiv.org/abs/2311.13602

Abstract
Content-aware graphic layout generation aims to automatically arrange visual elements along with a given content, such as an e-commerce product image. In this paper, we argue that the current layout generation approaches suffer from the limited training data for the high-dimensional layout structure. We show that a simple retrieval augmentation can significantly improve the generation quality. Our model, which is named Retrieval-Augmented Layout Transformer (RALF), retrieves nearest neighbor layout examples based on an input image and feeds these results into an autoregressive generator. Our model can apply retrieval augmentation to various controllable generation tasks and yield high-quality layouts within a unified architecture. Our extensive experiments show that RALF successfully generates content-aware layouts in both constrained and unconstrained settings and significantly outperforms the baselines.
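
The retrieval step described in the abstract (look up nearest-neighbor layouts for an input image, then hand them to the generator) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the toy feature vectors, the layout tuples, and the choice of cosine similarity are hypothetical and are not taken from the RALF code base.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_layouts(query_feature, database, k=2):
    """Return the layouts of the k database images most similar to the query.

    `database` is a list of (image_feature, layout) pairs. Both the feature
    extractor and the layout format are assumed, not specified by the source.
    """
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query_feature, item[0]),
        reverse=True,
    )
    return [layout for _, layout in ranked[:k]]


# Toy database: hypothetical 2-D image features paired with layouts, where
# each layout element is (label, x, y, width, height) in normalized units.
database = [
    ([1.0, 0.0], [("text", 0.1, 0.1, 0.8, 0.2)]),
    ([0.0, 1.0], [("logo", 0.4, 0.4, 0.2, 0.2)]),
    ([0.9, 0.1], [("text", 0.1, 0.7, 0.8, 0.2), ("logo", 0.05, 0.05, 0.2, 0.1)]),
]

# The retrieved layouts would then be passed to the generator as extra context.
retrieved = retrieve_layouts([1.0, 0.05], database, k=2)
```

In this toy setup the query is closest to the first and third database entries, so their layouts are the ones returned; a real system would use learned image features and a much larger layout database.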

# For an explanation of the paper itself, please see the slides below.
https://speakerdeck.com/udonda/cvpr24-oral-retrieval-augmented-layout-transformer-for-content-aware-layout-generation

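# Event overview
Authors of papers accepted at top conferences and top journals are invited to give "invited talks", sharing with a wide audience not only the content of their work but also the process of shaping research and papers that can withstand fierce competition and rigorous peer review. The hope is that this will encourage many students and researchers to take on top conferences and similar venues.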

Transcript

  1. RALF: Retrieval-Augmented Layout Transformer. 1) Retrieve layouts based on an input image; 2) use them to augment the generation process.
  2. CVPR 2024 Oral: 90 / 12,000 = 0.75 %. For comparison, the Kendo 8th-dan pass rate = 0.8 %.
  3. Period: 24/8 ~ 11, i.e. 3 months. GitHub log (Start to Deadline): 200K lines, 1100 lines / day.
  4. Schedule with GitHub log: 8/15 Read paper; 8/30 Finish re-implementation; 9/20 SoTA; 10/15 Version 1 draft; 11/17 Submit.
  5. First citation, from CGB-DM [Li+ arXiv24]: “Benefiting from the contributions made by RALF in this field, we conducted fair comparisons based on multiple baseline re-implementations by the RALF team” (in the experiment section). In other words, thanks to the RALF team's public release of baselines in this field, they were able to run fair experiments. Takeaway: when you release your work publicly, something nice happens.
  6. Commitment and Decision-Making. Embrace death, decide without hesitation. Always make the best move. “The Bushido is worth dying for.” Like “Learn or Die” in PFN.
  7. • I visited Nicu Sebe in Italy from January to July 2023. • 11/15 papers (73 %) from his group were accepted at CVPR24. • Learned a lot from Italian guys:
  8. • I visited Nicu Sebe in Italy from January to July 2023. • 11/15 papers (73 %) from his group were accepted at CVPR24. • Learned a lot from Italian guys: enjoy discussing “seriously”!