SketchODE: Learning neural sketch representation in continuous time
Ayan Das1,2, Yongxin Yang1,3, Timothy Hospedales1,3, Tao Xiang1,2, Yi-Zhe Song1,2
1SketchX, CVSSP, University of Surrey, UK
2iFlyTek-Surrey Joint research centre on AI
3University of Edinburgh, UK
Accepted as poster @ ICLR ‘22

Chirographic Data: Handwriting, Sketches etc.
• Usually represented as a sequence of discrete points
• This disregards the true nature of the data, which is continuous
QuickDraw
VectorMNIST*
* VectorMNIST (newly introduced) is a vectorized version of MNIST; see https://ayandas.me/sketchode
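To make the discrete-versus-continuous distinction concrete, here is a minimal sketch that treats a stored point sequence as samples of a continuous function s(t) on [0, 1]. The function name `polyline_at`, the assumption of equally spaced samples, and the use of linear interpolation are illustrative choices, not something the slides specify.

```python
def polyline_at(points, t):
    """Evaluate a piecewise-linear s(t), t in [0, 1], through equally spaced samples.

    `points` is a list of (x, y) tuples; equal spacing in t is an assumption.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    n = len(points) - 1
    if n == 0:
        return points[0]
    pos = t * n                 # position measured in segments
    i = min(int(pos), n - 1)    # index of the segment containing t
    frac = pos - i              # fraction travelled along that segment
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


# A three-point stroke queried between its stored samples.
stroke = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
midpoint_of_first_segment = polyline_at(stroke, 0.25)
```

The discrete list only stores three points, while `polyline_at` can be queried at any t, which is the view of the data that a continuous-time model works with.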

Representing continuous time strokes
• Previous approaches
  • Bézier curves [2]
  • Differential geometry [1]
  • etc.
[1] Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. CoSE: Compositional Stroke Embeddings. NeurIPS, 2020.
[2] Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. BézierSketch: A generative model for scalable vector sketches. ECCV, 2020.
Differential-geometry-based strokes + autoregressive generation (image taken from [1])
Bézier-curve-based strokes + autoregressive generation (image taken from [2])
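As a reminder of what a Bézier-based stroke representation looks like, the following minimal sketch evaluates a cubic Bézier curve from four control points using the Bernstein basis. It is a generic textbook formula, not the model from [2]; the function name `cubic_bezier` and the example control points are made up for illustration.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at t in [0, 1] from four (x, y) control points."""
    u = 1.0 - t
    # Bernstein basis polynomials of degree 3
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


# One stroke is a continuous curve, but it is limited to what four control points can express.
control = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
samples = [cubic_bezier(*control, i / 10) for i in range(11)]
```

Each stroke is continuous in t, but a full sketch still has to be produced stroke by stroke, which is where the autoregressive generation step comes in.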

Entire chirographic structure as one function
• Represent chirographic structures (including pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1]
[1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
Model
Solution/Inference
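The equations shown on the slide are not recoverable from the text. In the generic Neural ODE form of [1], the "Model" and "Solution/Inference" labels would correspond to something like the following, where the symbol F_θ for the learned dynamics network and the horizon T are notational assumptions rather than the slide's own notation.

```latex
\begin{align}
  \frac{\mathrm{d}s(t)}{\mathrm{d}t} &= F_\theta\big(s(t),\, t\big)
    && \text{(model)} \\
  s(T) &= s(0) + \int_{0}^{T} F_\theta\big(s(t),\, t\big)\,\mathrm{d}t
    && \text{(solution/inference)}
\end{align}
```

In practice the integral is evaluated by a numerical ODE solver, so the whole trajectory is obtained from the initial state and the dynamics alone.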

Learn latent representations for continuous time
functions
• Use a “Neural CDE” [2] to encode the data and a “Neural ODE” [1] to decode it, i.e. an autoencoder setup.
[1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
[2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural controlled differential equations for irregular time series. NeurIPS, 2020.
Augmented ODE
Second order
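One way to read the "Augmented ODE" and "Second order" labels is that a second-order ODE s'' = a(s, s', t) can be solved as a first-order system over the stacked state (s, s'). The sketch below shows that rewriting for a scalar state with a fixed-step RK4 integrator. The names `decode_second_order` and `accel` are hypothetical, the toy dynamics stand in for a learned network, and the actual decoder in the paper may differ.

```python
import math


def decode_second_order(accel, s0, v0, t_end, steps=1000):
    """Integrate s'' = accel(s, v, t) as the first-order system (s, v)' = (v, accel).

    Uses classical fixed-step RK4 and returns the sampled positions s(t_k).
    """
    h = t_end / steps
    s, v, t = s0, v0, 0.0
    traj = [s]
    for _ in range(steps):
        k1s, k1v = v, accel(s, v, t)
        k2s = v + 0.5 * h * k1v
        k2v = accel(s + 0.5 * h * k1s, v + 0.5 * h * k1v, t + 0.5 * h)
        k3s = v + 0.5 * h * k2v
        k3v = accel(s + 0.5 * h * k2s, v + 0.5 * h * k2v, t + 0.5 * h)
        k4s = v + h * k3v
        k4v = accel(s + h * k3s, v + h * k3v, t + h)
        s += h * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
        traj.append(s)
    return traj


# Toy dynamics (an assumption): a harmonic oscillator s'' = -omega^2 * s.
omega = 2.0
trajectory = decode_second_order(lambda s, v, t: -omega ** 2 * s, 1.0, 0.0, math.pi)
```

In a learned model the lambda would be replaced by a neural network, and the initial position and velocity would come from the latent code.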

Full-sequence & Multi-stroke format
• The exact format of 𝑠(𝑡) is important
• Either represent pen-ups as straight lines with a state bit
• Or represent the sketch as a sequence of individual strokes
Final state of the previous stroke is mapped to the initial state of the next stroke
(check the paper for more details)
“events”
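The first option (the full-sequence format) can be sketched as follows: consecutive strokes are joined by straight pen-up segments, and every point carries a state bit. The function name `to_full_sequence`, the `gap_points` parameter, and the (x, y, pen_down) layout are assumptions made for illustration; the paper defines the exact format.

```python
def to_full_sequence(strokes, gap_points=2):
    """Flatten strokes into (x, y, pen_down) triples.

    Consecutive strokes are joined by `gap_points` points on a straight
    pen-up segment, marked with pen_down = 0.
    """
    seq = []
    for k, stroke in enumerate(strokes):
        if k > 0:
            (x0, y0), (x1, y1) = strokes[k - 1][-1], stroke[0]
            for j in range(1, gap_points + 1):
                a = j / (gap_points + 1)   # fraction along the pen-up line
                seq.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0), 0))
        seq.extend((x, y, 1) for x, y in stroke)
    return seq


# Two short strokes joined by a single pen-up point.
two_strokes = [[(0.0, 0.0), (1.0, 0.0)], [(1.0, 2.0), (2.0, 2.0)]]
full = to_full_sequence(two_strokes, gap_points=1)
```

The result is one uninterrupted trajectory, so a single continuous function can cover the whole sketch, with the state bit telling a renderer which parts to draw.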

Training/Implementation tricks
• Sin/Cos activations for the dynamics functions
  • Help model high-frequency temporal changes in the trajectory
• Continuous noise augmentation
  • More intuitive in the continuous-time case
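The slides do not say which noise process is used for the continuous augmentation. As a stand-in, the sketch below perturbs a continuous trajectory s(t) with a smooth random function built from a few low-frequency sinusoids. The names `smooth_noise` and `augment`, and all parameter values, are hypothetical.

```python
import math
import random


def smooth_noise(num_terms=3, max_freq=3.0, amplitude=0.02, rng=None):
    """Return a random smooth function n(t) with |n(t)| <= amplitude."""
    rng = rng or random.Random()
    terms = [(rng.uniform(0.0, max_freq), rng.uniform(0.0, 2 * math.pi))
             for _ in range(num_terms)]
    scale = amplitude / num_terms

    def n(t):
        return scale * sum(math.sin(2 * math.pi * f * t + p) for f, p in terms)

    return n


def augment(s, rng=None):
    """Perturb a continuous trajectory s(t) -> (x, y) with independent smooth noise per axis."""
    rng = rng or random.Random()
    nx, ny = smooth_noise(rng=rng), smooth_noise(rng=rng)
    return lambda t: (s(t)[0] + nx(t), s(t)[1] + ny(t))


# Augment a toy parabola; the perturbed curve is still a function of continuous t.
noisy_curve = augment(lambda t: (t, t * t), rng=random.Random(0))
```

Because the noise is itself a function of t, the augmented data stays a smooth curve instead of a jittered point set, which is why this kind of augmentation fits the continuous-time setting naturally.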

Reconstruction & Generation
• Faithful reconstruction, just like deterministic RNN-RNN
• Generation by injecting noise into latent space
• RNN-RNNs break continuity due to autoregression.
One-shot Generation
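Generation by latent noise injection can be sketched in a few lines: perturb a latent code with Gaussian noise, then decode once. The function name `generate`, the `decode` callable, and the value of `sigma` are placeholders; in the real model `decode` would be the ODE-based decoder.

```python
import random


def generate(z, decode, sigma=0.1, rng=None):
    """Perturb latent code z with Gaussian noise, then decode it in a single call."""
    rng = rng or random.Random()
    z_noisy = [zi + sigma * rng.gauss(0.0, 1.0) for zi in z]
    return decode(z_noisy)


# Identity decoder used only to show the call shape.
sample = generate([0.5, -1.0, 2.0], decode=lambda z: z, rng=random.Random(0))
```

The decoder is called once per sample and returns the whole trajectory, in contrast to an autoregressive decoder that emits one point at a time.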

Inherently smooth latent space
• Due to the continuous nature of the latent-to-decoder mapping, “SketchODE” enjoys inherent continuity
One-shot Interpolation (figure panels: SketchODE vs. RNN-RNN)
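Latent interpolation itself is simple to sketch: blend two latent codes linearly and decode each blend. The helper name `interpolate_latents` is hypothetical, and linear blending is an assumption, since the slides do not state the interpolation scheme.

```python
def interpolate_latents(z_a, z_b, num=5):
    """Return `num` latent codes evenly spaced on the line from z_a to z_b (inclusive)."""
    if num < 2:
        raise ValueError("num must be at least 2")
    codes = []
    for i in range(num):
        a = i / (num - 1)   # blend weight, from 0 to 1
        codes.append([(1 - a) * x + a * y for x, y in zip(z_a, z_b)])
    return codes


# Three codes: the two endpoints and their midpoint.
path = interpolate_latents([0.0, 2.0], [1.0, 0.0], num=3)
```

Each intermediate code would then be decoded in a single pass, so how smoothly the output changes depends only on how smooth the latent-to-trajectory map is.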

Abstraction effect by squeezing frequency
• We noticed a peculiar property: the “abstraction effect”
• It arises from the periodic nature of the activations and their frequencies
Decreasing frequency content
Refer to the paper for more details
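A toy illustration of why lowering the frequency of a periodic activation simplifies the output: a single sine unit with a frequency multiplier `omega`, measured by how much its output moves over t in [0, 1]. The names `sin_unit` and `total_variation` and the multiplier form are assumptions for illustration; the paper describes the actual mechanism.

```python
import math


def sin_unit(t, weight, omega):
    """A single sine-activated unit: sin(omega * weight * t)."""
    return math.sin(omega * weight * t)


def total_variation(f, steps=2000):
    """Approximate the total variation of f on [0, 1] using a uniform grid."""
    ys = [f(i / steps) for i in range(steps + 1)]
    return sum(abs(b - a) for a, b in zip(ys, ys[1:]))


weight = 2 * math.pi * 8    # eight full oscillations on [0, 1] when omega = 1
full = total_variation(lambda t: sin_unit(t, weight, omega=1.0))
squeezed = total_variation(lambda t: sin_unit(t, weight, omega=0.25))
```

With the smaller multiplier the unit oscillates fewer times over the same interval, so `squeezed` is smaller than `full`. This mirrors, in one dimension, the loss of fine detail as frequency content decreases.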

Thank you!
Check out the project page @ https://ayandas.me/sketchode