Data structure of vector graphics
[Figure: two example documents, each shown as a Canvas containing one Image element and four Text elements]
Document: Canvas, Layers
Canvas: Width, Height, Category, …
Layer: Type, Position, Size, Appearance, Text, Pixels, …
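The structure on this slide can be sketched as a few Python dataclasses. This is only an illustration under assumptions: the field types, the `appearance` dictionary, and the example values (including the font name `ExampleSans`) are hypothetical and are not taken from the cited papers. Attributes elided with "…" on the slide are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Canvas:
    # Document-level attributes listed on the slide.
    width: int
    height: int
    category: str


@dataclass
class Layer:
    # Per-element attributes listed on the slide; types are assumptions.
    type: str                          # e.g. "image" or "text"
    position: Tuple[float, float]      # (x, y) of the top-left corner
    size: Tuple[float, float]          # (width, height)
    appearance: dict = field(default_factory=dict)  # e.g. font, color
    text: Optional[str] = None         # set only for text layers
    pixels: Optional[bytes] = None     # set only for image layers


@dataclass
class Document:
    canvas: Canvas
    layers: List[Layer] = field(default_factory=list)


# Hypothetical example: one image layer and one text layer.
doc = Document(
    canvas=Canvas(width=800, height=600, category="flyer"),
    layers=[
        Layer(type="image", position=(0, 0), size=(800, 600)),
        Layer(
            type="text",
            position=(100, 80),
            size=(600, 120),
            appearance={"font": "ExampleSans", "color": "#ffffff"},
            text="CAR WASH",
        ),
    ],
)
```

A variable-length list of layers with heterogeneous attributes is what makes a document look like a sequence of multi-field tokens.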
N Inoue et al., LayoutDM: Discrete Diffusion Model for Controllable Layout Generation, CVPR 2023
N Inoue et al., Towards Flexible Multi-modal Document Models, CVPR 2023
Recent work
02
Multi-task use of masked autoencoders
● A multimodal, BERT-like model → handles diverse tasks by switching the masking pattern
[Figure: design tasks = masking patterns. Two panels, "Font & color prediction" and "Element filling", each showing an example design ("BEST IN TOWN!", "CAR WASH", "Full service") with elements 1 to 5 and a grid of fields (Type, Position, Img-emb., Text-emb., Color / font) per element plus context; cells are marked [NULL] or [MASK].]
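The idea that a design task is just a choice of masking pattern can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the cited papers: the field names mirror the figure, but `apply_mask`, the pattern functions, and the example values are hypothetical. It also assumes that `[NULL]` marks a field that does not apply to an element (for example, a text embedding on an image), and as a simplification it leaves `[NULL]` cells untouched.

```python
FIELDS = ["type", "position", "img_emb", "text_emb", "color_font"]
MASK, NULL = "[MASK]", "[NULL]"


def apply_mask(elements, pattern):
    """Return a copy of `elements` where every field selected by
    pattern(index, field) is replaced with [MASK]; [NULL] cells are kept."""
    masked = []
    for i, element in enumerate(elements):
        row = {}
        for name in FIELDS:
            value = element.get(name, NULL)
            row[name] = MASK if value != NULL and pattern(i, name) else value
        masked.append(row)
    return masked


def font_color_prediction(index, name):
    # Mask the color/font field of every element.
    return name == "color_font"


def element_filling(target):
    # Mask every field of one target element.
    return lambda index, name: index == target


# Hypothetical two-element document: one image, one text.
elements = [
    {"type": "image", "position": (0.0, 0.0), "img_emb": "img-vec",
     "text_emb": NULL, "color_font": NULL},
    {"type": "text", "position": (0.1, 0.2), "img_emb": NULL,
     "text_emb": "BEST IN TOWN!", "color_font": "red / ExampleSans"},
]

style_task = apply_mask(elements, font_color_prediction)
fill_task = apply_mask(elements, element_filling(1))
```

In both cases the model input has the same shape; only the set of `[MASK]` cells changes, which is what lets one model serve several tasks.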