SketchODE: Learning neural sketch representation in continuous time

Ayan Das
March 24, 2022


Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression. Despite being inherently continuous-time data, existing works have treated these as discrete-time sequences, disregarding their true nature. In this work, we model such data as continuous-time functions and learn compact representations by virtue of Neural Ordinary Differential Equations. To this end, we introduce the first continuous-time Seq2Seq model and demonstrate some remarkable properties that set it apart from traditional discrete-time analogues. We also provide solutions for some practical challenges for such models, including introducing a family of parameterized ODE dynamics & continuous-time data augmentation particularly suitable for the task. Our models are validated on several datasets including VectorMNIST, DiDi and Quick, Draw!.


Transcript

  1. SketchODE: Learning neural sketch representation in continuous time
     Ayan Das1,2, Yongxin Yang1,3, Timothy Hospedales1,3, Tao Xiang1,2, Yi-Zhe Song1,2
     1SketchX, CVSSP, University of Surrey, UK; 2iFlyTek-Surrey Joint Research Centre on AI; 3University of Edinburgh, UK
     Accepted as poster @ ICLR '22
  2. Chirographic Data: Handwriting, Sketches, etc.
     • Usually represented as a sequence of discrete points
     • This disregards their true nature, which is continuous
     • Examples: QuickDraw, VectorMNIST*
     * VectorMNIST (newly introduced) is a vectorized version of MNIST; see https://ayandas.me/sketchode
  3. Representing continuous-time strokes
     • Previous approaches: Bézier curves [2], differential geometry [1], etc.
     • Bézier-curve-based strokes + autoregressive generation (image taken from [2])
     • Differential-geometry-based strokes + autoregressive generation (image taken from [1])
     [1] Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. CoSE: Compositional Stroke Embeddings. NeurIPS, 2020.
     [2] Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. BézierSketch: A generative model for scalable vector sketches. ECCV, 2020.
  4. Entire chirographic structure as one function
     • Treat a chirographic structure (including pen-up events) as one continuous-time function s(t) and model its derivative with a Neural ODE [1]
     • The learned dynamics define the model; solving the ODE forward in time gives the trajectory (solution/inference)
     • A minimal sketch of this idea follows below
     [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
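
The idea above can be written down in a few lines. The following is a minimal sketch, not the authors' code: it assumes PyTorch and the torchdiffeq package, parameterizes the derivative of a 2D pen trajectory, ds/dt = f_theta(s, t), with a small MLP, and solves the ODE forward to obtain s(t) at arbitrary query times.

import torch
import torch.nn as nn
from torchdiffeq import odeint   # standard Neural ODE solver interface

class Dynamics(nn.Module):
    """f_theta(s, t): parameterizes the time derivative of the trajectory s(t)."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, t, s):
        # Feed the current time alongside the state so the velocity can vary with t.
        t_feat = t * torch.ones_like(s[..., :1])
        return self.net(torch.cat([s, t_feat], dim=-1))

f = Dynamics()
s0 = torch.zeros(1, 2)               # initial pen position
ts = torch.linspace(0.0, 1.0, 100)   # query times in [0, 1]
trajectory = odeint(f, s0, ts)       # s(t) at each query time: shape (100, 1, 2)
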
  5. Learning latent representations for continuous-time functions
     • Use a Neural CDE [2] to encode the data and a Neural ODE [1] to decode it, i.e. an autoencoder setup
     • The decoder uses an augmented, second-order ODE; a simplified sketch of the overall structure follows below
     [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
     [2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural Controlled Differential Equations for Irregular Time Series. NeurIPS, 2020.
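
A minimal sketch of this autoencoder structure, under assumed shapes and with plain Euler steps instead of an adaptive solver (the paper's actual architecture, e.g. the augmented second-order decoder, is richer): the encoder integrates dz = F_theta(z) dX along the observed path X(t) to produce a latent code, and the decoder integrates dy/dt = g_phi(y, c) from a learned start state conditioned on that code.

import torch
import torch.nn as nn

class CDEEncoder(nn.Module):
    """Neural-CDE-style encoder: dz = F_theta(z) dX along the input path X."""
    def __init__(self, channels=3, hidden=32, latent=16):
        super().__init__()
        self.F = nn.Sequential(nn.Linear(hidden, 64), nn.Tanh(),
                               nn.Linear(64, hidden * channels))
        self.init = nn.Linear(channels, hidden)
        self.readout = nn.Linear(hidden, latent)
        self.hidden, self.channels = hidden, channels

    def forward(self, X):                        # X: (batch, T, channels), a sampled path
        z = self.init(X[:, 0])                   # hidden state initialised from X(0)
        for k in range(X.shape[1] - 1):
            dX = (X[:, k + 1] - X[:, k]).unsqueeze(-1)            # path increment
            Fz = self.F(z).view(-1, self.hidden, self.channels)   # (batch, hidden, channels)
            z = z + torch.bmm(Fz, dX).squeeze(-1)                 # Euler step of dz = F(z) dX
        return self.readout(z)                   # latent code c

class ODEDecoder(nn.Module):
    """Neural-ODE-style decoder: dy/dt = g_phi(y, c), integrated from a learned start state."""
    def __init__(self, channels=3, latent=16, hidden=64):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(channels + latent, hidden), nn.Tanh(),
                               nn.Linear(hidden, channels))
        self.y0 = nn.Linear(latent, channels)

    def forward(self, c, steps=100):
        y, dt, out = self.y0(c), 1.0 / steps, []
        for _ in range(steps):
            y = y + dt * self.g(torch.cat([y, c], dim=-1))        # Euler step
            out.append(y)
        return torch.stack(out, dim=1)           # reconstructed path: (batch, steps, channels)

X = torch.rand(8, 120, 3)                        # batch of sampled drawings (x, y, pen-state)
enc, dec = CDEEncoder(), ODEDecoder()
recon = dec(enc(X), steps=X.shape[1])            # train with e.g. MSE between recon and X
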
  6. Full-sequence & multi-stroke formats
     • The exact format of s(t) is important
     • Either represent pen-ups as straight-line segments flagged by a pen-state bit (full-sequence format)
     • Or represent the drawing as a sequence of individual strokes, where "events" map the final state of each stroke to the initial state of the next (check the paper for more details)
     • A pre-processing sketch for the full-sequence option follows below
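
A small, hypothetical pre-processing sketch for the full-sequence option above (the function name and the number of interpolated travel points are illustrative): strokes are concatenated into one path, the gap between consecutive strokes becomes a straight-line segment, and a third channel carries the pen-state bit (1 = pen down, 0 = pen up).

import numpy as np

def to_full_sequence(strokes, travel_points=5):
    """strokes: list of (n_i, 2) arrays of pen-down points, in drawing order."""
    segments = []
    for i, s in enumerate(strokes):
        # Pen-down points carry a state bit of 1.
        segments.append(np.concatenate([s, np.ones((len(s), 1))], axis=1))
        if i + 1 < len(strokes):
            # Straight "travel" segment to the start of the next stroke, state bit 0.
            travel = np.linspace(s[-1], strokes[i + 1][0], num=travel_points)
            segments.append(np.concatenate([travel, np.zeros((travel_points, 1))], axis=1))
    return np.concatenate(segments, axis=0)       # (N, 3): x, y, pen-state

# Example: two strokes of an "X" become a single continuous (x, y, pen-state) path.
path = to_full_sequence([np.array([[0., 0.], [1., 1.]]), np.array([[1., 0.], [0., 1.]])])
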
  7. Training/implementation tricks
     • Sin/Cos activations in the dynamics functions, to capture high-frequency temporal changes in the trajectory
     • Continuous noise augmentation, which is more intuitive in the continuous-time case
     • Sketches of both tricks follow below
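
Minimal sketches of the two tricks above, in assumed forms rather than verbatim from the paper. The first is a dynamics network whose hidden layer passes through sin/cos, so the ODE derivative can express high-frequency temporal changes; the second adds a smooth random function (a few random low-frequency sinusoids) to the whole trajectory instead of jittering individual points.

import math
import torch
import torch.nn as nn

class SinCosDynamics(nn.Module):
    """ODE dynamics with periodic activations; omega controls the frequency content."""
    def __init__(self, dim=3, hidden=64, omega=10.0):
        super().__init__()
        self.l1 = nn.Linear(dim + 1, hidden)
        self.l2 = nn.Linear(2 * hidden, dim)
        self.omega = omega

    def forward(self, t, s):
        h = self.l1(torch.cat([s, t * torch.ones_like(s[..., :1])], dim=-1))
        h = torch.cat([torch.sin(self.omega * h), torch.cos(self.omega * h)], dim=-1)
        return self.l2(h)

def smooth_noise_augment(traj, t, n_waves=3, amp=0.02):
    """Add a smooth random function to a trajectory; traj: (T, d) samples, t: (T,) times in [0, 1]."""
    noise = torch.zeros_like(traj)
    for _ in range(n_waves):
        freq = torch.rand(1) * 4.0                                        # random low frequency
        phase = torch.rand(1) * 2 * math.pi
        wave = torch.sin(2 * math.pi * freq * t + phase).unsqueeze(-1)    # (T, 1) smooth wave
        noise = noise + amp * wave * torch.randn(1, traj.shape[-1])       # random per-channel amplitude
    return traj + noise
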
  8. Reconstruction & generation
     • Faithful reconstruction, just like a deterministic RNN-RNN autoencoder
     • One-shot generation by injecting noise into the latent space (see the sketch below)
     • RNN-RNN baselines break continuity due to autoregression
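
One-shot generation as described above can be sketched as a small helper (assumed usage; encoder and decoder stand for any deterministic continuous-time encoder/decoder pair, such as the CDEEncoder/ODEDecoder sketched earlier): encode a reference drawing, perturb its latent code with Gaussian noise, and decode the whole trajectory in a single ODE solve, with no autoregressive feedback.

import torch

def generate_variant(encoder, decoder, X, sigma=0.1):
    """Return a novel drawing near X by perturbing its latent code."""
    with torch.no_grad():
        c = encoder(X)                           # latent code of the reference drawing
        c_new = c + sigma * torch.randn_like(c)  # inject noise into the latent space
        return decoder(c_new)                    # one-shot decoded trajectory
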
  9. Inherently smooth latent space
     • Due to the continuous nature of the latent-to-decoder mapping, SketchODE enjoys inherently smooth one-shot interpolation, unlike RNN-RNN baselines
     • A latent-interpolation sketch follows below
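
A sketch of the latent-space interpolation above (assumed usage with any deterministic decoder, e.g. the ODEDecoder sketched earlier): because the map from latent code to trajectory is continuous, decoding convex combinations of two codes yields a smooth morph between two drawings.

import torch

def interpolate(decoder, c_a, c_b, n=8):
    """Decode n drawings along the straight line between latent codes c_a and c_b."""
    with torch.no_grad():
        return [decoder((1 - a) * c_a + a * c_b) for a in torch.linspace(0.0, 1.0, n)]
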
  10. Abstraction effect by squeezing frequency
     • We noticed a peculiar property, an "abstraction effect": decreasing the frequency content of the periodic activations yields progressively more abstract drawings
     • Refer to the paper for more details; a speculative sketch of the mechanism follows below
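
A speculative sketch of the mechanism hinted at above (inferred from the slide, not from the paper's code): with periodic dynamics such as the SinCosDynamics sketched earlier, scaling down the activation frequency omega at decode time removes high-frequency detail from the output, which reads as a more abstract version of the same drawing.

import torch

def decode_at_frequency(decoder, dynamics, c, omega_scale):
    """Decode latent code c with the dynamics' base frequency scaled by omega_scale."""
    original = dynamics.omega                    # assumes the dynamics net exposes its frequency
    dynamics.omega = original * omega_scale      # squeeze the frequency content
    try:
        with torch.no_grad():
            return decoder(c)                    # same latent code, smoother / more abstract output
    finally:
        dynamics.omega = original                # restore the original frequency
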