International Conference on Learning Representations (ICLR) 2023
ChiroDiff: Modelling chirographic data with Diffusion Models
Ayan Das 1,2, Yongxin Yang 1,3, Timothy Hospedales 1,4,5, Tao Xiang 1,2, Yi-Zhe Song 1,2
1 SketchX Lab, University of Surrey; 2 iFlyTek-Surrey Joint Research Centre on AI;
3 Queen Mary University of London; 4 University of Edinburgh; 5 Samsung AI Center Cambridge

Raster vs Vector for sparse structures
Graphics/Vision models mostly deal with grid-based raster images!
Generic Representation (non-optimised for sparse structures)
Specialized Representation (optimised for sparsity)
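To make the contrast concrete, here is a small illustrative sketch (not the paper's code): the same stroke stored as a point sequence (vector/chirographic form) versus rasterized onto a grid, where almost every cell is empty. All values and the 32x32 grid size are arbitrary choices for the example.

```python
import numpy as np

# A short stroke as chirographic (vector) data: a sequence of (x, y) points.
# Storage grows with stroke length, not canvas resolution.
stroke = np.array([[0.1, 0.1], [0.3, 0.4], [0.6, 0.5], [0.9, 0.9]])

# The same stroke rasterized onto a 32x32 grid: a generic representation
# in which nearly every cell stays empty -- wasteful for sparse structures.
H = W = 32
raster = np.zeros((H, W), dtype=np.uint8)
cols = (stroke[:, 0] * (W - 1)).astype(int)
rows = (stroke[:, 1] * (H - 1)).astype(int)
raster[rows, cols] = 1

occupied = raster.sum() / raster.size
print(f"vector: {stroke.shape[0]} points; raster: {occupied:.1%} of cells inked")
# prints: vector: 4 points; raster: 0.4% of cells inked
```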

Chirographic Data: Handwriting, Sketches etc.
Generative modelling and manipulation
English digits [1] (simple)
Chinese characters [2] (complex compositional structure)
Sketches [3] (freehand, noisy)
[1] A. Das, Y. Yang, T. M. Hospedales, T. Xiang, and Y. Z. Song. SketchODE: Learning neural sketch representation in continuous time. In ICLR, 2022.
[2] KanjiVG dataset: https://kanjivg.tagaini.net/
[3] D. Ha and D. Eck. A neural representation of sketch drawings. In ICLR, 2018.

Popular auto-regressive generative models
One segment/point at a time
Control points instead of segments [2]
p(s_i | s_{i-1}; θ): learning "drawing dynamics" [1, 2]
p(s_0, s_1, ⋯; θ): learning "holistic concepts"
[1] D. Ha and D. Eck. A neural representation of sketch drawings. In ICLR, 2018.
[2] A. Das, Y. Yang, T. M. Hospedales, T. Xiang, and Y. Z. Song. BezierSketch: A generative model for scalable vector sketches. In ECCV, 2020.
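The two factorizations sample differently. A hedged toy sketch (the `step` and `joint` lambdas are stand-ins for trained networks, not real models): an autoregressive model emits one point conditioned on the previous one, so errors accumulate along the sequence, while a holistic model emits the whole sequence jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar_sample(n_points, step_model):
    """Autoregressive p(s_i | s_{i-1}): each point depends on the last,
    so the model captures local 'drawing dynamics'."""
    seq = [np.zeros(2)]
    for _ in range(n_points - 1):
        seq.append(step_model(seq[-1]))
    return np.stack(seq)

def holistic_sample(n_points, joint_model):
    """Non-autoregressive p(s_0, s_1, ...): the whole sequence is produced
    jointly, capturing the holistic concept rather than step dynamics."""
    return joint_model(n_points)

# Toy stand-ins: a noisy drift step, and a joint sampler over all points.
step = lambda prev: prev + np.array([0.1, 0.05]) + 0.01 * rng.standard_normal(2)
joint = lambda n: rng.standard_normal((n, 2))

print(ar_sample(16, step).shape, holistic_sample(16, joint).shape)
# prints: (16, 2) (16, 2)
```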

Some newer approaches
Continuous-time model [1] of chirographic data
Learns the holistic concept as a vector field
Drawbacks: over-smoothing; training difficulty of the underlying tools
[1] A. Das, Y. Yang, T. M. Hospedales, T. Xiang, and Y. Z. Song. SketchODE: Learning neural sketch representation in continuous time. In ICLR, 2022.

"ChiroDiff" is our solution
Model the chirographic sequence in a non-autoregressive manner
p(s_0, s_1, ⋯; θ): learns holistic concepts, not dynamics
Diffusion models allow us to realise this
No over-smoothing, and the representation stays discrete
Much easier to train, as with diffusion models in general
Allows variable length and length conditioning

Our framework
Standard noising, non-autoregressive sequence de-noiser
The reverse process can modify any part of the sequence at any time step, unlike auto-regressive models.
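As an illustrative sketch of "standard noising" on a point sequence (assumed linear beta schedule and plain epsilon-prediction MSE; not the paper's exact implementation), note that the denoiser receives the entire noisy sequence at once:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def training_loss(denoiser, x0):
    """One DDPM-style training step. x0: (N, 2) sequence of points."""
    t = rng.integers(0, T)                        # random diffusion timestep
    eps = rng.standard_normal(x0.shape)           # Gaussian noise
    # Standard noising q(x_t | x_0):
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps
    # The denoiser sees the WHOLE noisy sequence (non-autoregressive)
    # and predicts the noise added to every point simultaneously.
    return np.mean((denoiser(x_t, t) - eps) ** 2)

# Toy stand-in for the learned denoiser (a Bi-RNN / Transformer in practice).
toy = lambda x_t, t: np.zeros_like(x_t)
loss = training_loss(toy, rng.standard_normal((64, 2)))
```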

Reverse Generative Process
Bi-RNN or Transformer encoder (with positional encoding) as the learnable denoiser
Image sources:
[1] https://d2l.ai/chapter_recurrent-modern/bi-rnn.html
[2] https://jalammar.github.io/illustrated-transformer/
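A minimal sketch of the Bi-RNN variant with untrained random weights (every dimension and weight here is an arbitrary assumption for illustration; a Transformer encoder with positional encoding is the drop-in alternative). The key property is that the denoiser outputs a noise prediction for every point of the sequence in one pass:

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 2, 16                                      # point dim, hidden dim
Wf = rng.standard_normal((D + 1 + H, H)) * 0.1    # forward-RNN weights
Wb = rng.standard_normal((D + 1 + H, H)) * 0.1    # backward-RNN weights
Wo = rng.standard_normal((2 * H, D)) * 0.1        # output head

def rnn_pass(x, W):
    """Run a simple tanh RNN over the sequence, returning all hidden states."""
    h, out = np.zeros(H), []
    for p in x:
        h = np.tanh(np.concatenate([p, h]) @ W)
        out.append(h)
    return np.stack(out)

def denoise(x_t, t):
    """Predict the noise for EVERY point of the noisy sequence at once."""
    # Append a (crudely normalized) timestep feature to each point.
    xt = np.concatenate([x_t, np.full((len(x_t), 1), t / 1000.0)], axis=1)
    fwd = rnn_pass(xt, Wf)                 # left-to-right context
    bwd = rnn_pass(xt[::-1], Wb)[::-1]     # right-to-left context
    return np.concatenate([fwd, bwd], axis=1) @ Wo

eps_pred = denoise(rng.standard_normal((32, 2)), t=500)
```

Because both directions are combined at each position, the prediction for any point can depend on the whole sequence, matching the non-autoregressive reverse process above.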

Unconditional Generation
High-quality generations

Properties of our Model (1)
Implicit conditioning and healing
Different degrees of correlation
Healing badly drawn sketches

Properties of our Model (2)
Stochastic recreation, semantic interpolation
Inferring drawing topology given perceptive input (ink-clouds)
Interpolation between samples (with deterministic DDIM latent space)
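A hedged sketch of why DDIM enables this (assumed linear schedule; `eps_pred` is supplied by a trained denoiser in practice, zeros here): with eta = 0 the reverse step adds no noise, so a fixed latent maps to a fixed sample, and interpolating two latents decodes a semantic in-between.

```python
import numpy as np

T = 1000
alpha_bar = np.cumprod(1 - np.linspace(1e-4, 0.02, T))   # assumed schedule

def ddim_step(x_t, eps_pred, t, t_prev):
    """Deterministic DDIM reverse step (eta = 0): no noise term."""
    x0_pred = (x_t - np.sqrt(1 - alpha_bar[t]) * eps_pred) / np.sqrt(alpha_bar[t])
    return (np.sqrt(alpha_bar[t_prev]) * x0_pred
            + np.sqrt(1 - alpha_bar[t_prev]) * eps_pred)

def slerp(z0, z1, lam):
    """Spherical interpolation between two Gaussian latents."""
    cos = (z0 * z1).sum() / (np.linalg.norm(z0) * np.linalg.norm(z1))
    om = np.arccos(np.clip(cos, -1.0, 1.0))
    return (np.sin((1 - lam) * om) * z0 + np.sin(lam * om) * z1) / np.sin(om)

rng = np.random.default_rng(2)
z = slerp(rng.standard_normal((64, 2)), rng.standard_normal((64, 2)), 0.5)
x = ddim_step(z, np.zeros_like(z), 999, 998)   # one step of the chain
```

Running the full deterministic chain from each interpolated latent yields the interpolation sequence shown on the slide.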

Properties of our Model (3)
Tweaking the reverse-process variance
Controlled level of abstraction
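An illustrative DDPM-style ancestral step with a variance scale `s` (an assumption for illustration, not the paper's exact schedule): shrinking `s` below 1 yields cleaner, more prototypical samples, while enlarging it yields looser, more varied ones.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1 - betas
alpha_bar = np.cumprod(alphas)

def reverse_step(x_t, eps_pred, t, s=1.0):
    """One reverse step; s scales the injected noise (s=1 is standard DDPM)."""
    mean = (x_t - betas[t] / np.sqrt(1 - alpha_bar[t]) * eps_pred) \
           / np.sqrt(alphas[t])
    if t == 0:
        return mean                                  # final step: no noise
    return mean + s * np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

x = rng.standard_normal((32, 2))                     # a noisy point sequence
low  = reverse_step(x, np.zeros_like(x), 500, s=0.5)  # less stochastic
high = reverse_step(x, np.zeros_like(x), 500, s=1.5)  # more stochastic
```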

Thank you.
Read the paper or visit our website for more information:
https://ayandas.me/chirodiff/