
SketchODE: Learning neural sketch representation in continuous time

Ayan Das
March 24, 2022


Learning meaningful representations for chirographic drawing data such as sketches, handwriting, and flowcharts is a gateway for understanding and emulating human creative expression. Despite being inherently continuous-time data, existing works have treated these as discrete-time sequences, disregarding their true nature. In this work, we model such data as continuous-time functions and learn compact representations by virtue of Neural Ordinary Differential Equations. To this end, we introduce the first continuous-time Seq2Seq model and demonstrate some remarkable properties that set it apart from traditional discrete-time analogues. We also provide solutions for some practical challenges for such models, including introducing a family of parameterized ODE dynamics & continuous-time data augmentation particularly suitable for the task. Our models are validated on several datasets including VectorMNIST, DiDi and Quick, Draw!.


Transcript

  1. SketchODE: Learning neural sketch representation in continuous time. Ayan Das [1,2],

    Yongxin Yang [1,3], Timothy Hospedales [1,3], Tao Xiang [1,2], Yi-Zhe Song [1,2]. [1] SketchX, CVSSP, University of Surrey, UK; [2] iFlyTek-Surrey Joint Research Centre on AI; [3] University of Edinburgh, UK. Accepted as a poster @ ICLR ‘22
  2. Chirographic Data: Handwriting, Sketches etc.

  3. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points
  4. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points
  5. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points • Disregards the true nature, which is continuous
  6. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points • Disregards the true nature, which is continuous
  7. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points • Disregards the true nature, which is continuous QuickDraw
  8. Chirographic Data: Handwriting, Sketches etc. • Usually represented as sequence

    of discrete points • Disregards the true nature, which is continuous (examples shown from QuickDraw and VectorMNIST*). * VectorMNIST (newly introduced) is a vectorized version of MNIST; see https://ayandas.me/sketchode
  9. Representing continuous time strokes • Previous approaches

  10. Representing continuous time strokes • Previous approaches • Bezier curves

    [2] Image taken from [2]
  11. Representing continuous time strokes • Previous approaches • Bezier curves

    [2] • Differential Geometry [1] • etc … [1] Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. Cose: Compositional stroke embeddings. NeurIPS, 2020. [2] Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. BezierSketch: A generative model for scalable vector sketches. ECCV, 2020. Image taken from [2] Image taken from [1]
  12. Representing continuous time strokes • Previous approaches • Bezier curves

    [2] • Differential Geometry [1] • etc … [1] Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. Cose: Compositional stroke embeddings. NeurIPS, 2020. [2] Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. BezierSketch: A generative model for scalable vector sketches. ECCV, 2020. Image taken from [2] Image taken from [1] Bezier curve based stroke + Autoregressive Generation
  13. Representing continuous time strokes • Previous approaches • Bezier curves

    [2] • Differential Geometry [1] • etc … [1] Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. Cose: Compositional stroke embeddings. NeurIPS, 2020. [2] Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. BezierSketch: A generative model for scalable vector sketches. ECCV, 2020. Image taken from [2] Image taken from [1] Diff. Geometry based stroke + Autoregressive Generation Bezier curve based stroke + Autoregressive Generation
  14. Entire chirographic structure as one function

  15. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
  16. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
  17. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
  18. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018.
  19. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. (figure: the ODE model)
  20. Entire chirographic structure as one function • Treat chirographic structures (including

    pen-up events) as one continuous-time function 𝑠(𝑡) and model its derivative using a Neural ODE [1] (an illustrative code sketch follows below) [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. (figures: the ODE model and its solution/inference)
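The following is a minimal sketch of that idea, not the authors' code: the drawing 𝑠(𝑡) is recovered by integrating a learned dynamics network f_theta with an off-the-shelf solver (torchdiffeq's odeint). The 3-dimensional state (x, y, pen bit), the network size and the time grid are assumptions made purely for illustration.

import torch
import torch.nn as nn
from torchdiffeq import odeint   # pip install torchdiffeq

class Dynamics(nn.Module):
    """f_theta(t, s): instantaneous rate of change of the pen state."""
    def __init__(self, state_dim=3, hidden=64):   # (x, y, pen-state bit) -- an assumed layout
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, t, s):
        return self.net(s)                # ds/dt, kept independent of t for simplicity

f = Dynamics()
s0 = torch.zeros(1, 3)                    # pen state at t = 0
t = torch.linspace(0.0, 1.0, 200)         # evaluation times
trajectory = odeint(f, s0, t)             # (200, 1, 3): the whole drawing s(t) from one solve

Since the entire trajectory comes from a single solve of the learned dynamics, there is no step-by-step autoregressive prediction involved.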
  21. Learn latent representations for continuous time functions

  22. Learn latent representations for continuous time functions • Use a “Neural

    CDE” [2] to encode the data and a “Neural ODE” [1] to decode it – an autoencoder setup. [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. [2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural controlled differential equations for irregular time series. NeurIPS, 2020.
  23. Learn latent representations for continuous time functions • Use a “Neural

    CDE” [2] to encode the data and a “Neural ODE” [1] to decode it – an autoencoder setup. [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. [2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural controlled differential equations for irregular time series. NeurIPS, 2020.
  24. Learn latent representations for continuous time functions • Use a “Neural

    CDE” [2] to encode the data and a “Neural ODE” [1] to decode it – an autoencoder setup. [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. [2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural controlled differential equations for irregular time series. NeurIPS, 2020. (figure label: Augmented ODE)
  25. Learn latent representations for continuous time functions • Use a “Neural

    CDE” [2] to encode the data and a “Neural ODE” [1] to decode it – an autoencoder setup (a simplified code sketch follows below). [1] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural Ordinary Differential Equations. NeurIPS, 2018. [2] Patrick Kidger, James Morrill, James Foster, and Terry J. Lyons. Neural controlled differential equations for irregular time series. NeurIPS, 2020. (figure labels: Augmented ODE; Second order)
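A simplified sketch of the autoencoder setup, under my own assumptions rather than the released implementation: the Neural CDE encoder is written as an explicit Euler discretisation of dz = g_phi(z) dX along the observed path X(t), and the decoder integrates a latent-conditioned ODE (also with a plain Euler loop) to reproduce the trajectory. All module names, sizes and the discretisation are illustrative choices.

import torch
import torch.nn as nn

class CDEEncoder(nn.Module):
    """Toy Neural CDE: dz = g_phi(z) dX, discretised with explicit Euler steps."""
    def __init__(self, in_dim=3, hid=32, latent=16):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(hid, 64), nn.Tanh(),
                               nn.Linear(64, hid * in_dim))
        self.z0 = nn.Parameter(torch.zeros(hid))   # learned initial hidden state
        self.to_latent = nn.Linear(hid, latent)
        self.in_dim, self.hid = in_dim, hid

    def forward(self, X):                 # X: (T, in_dim) samples of the control path
        z = self.z0
        for k in range(X.shape[0] - 1):
            dX = X[k + 1] - X[k]                          # increment of the path
            G = self.g(z).view(self.hid, self.in_dim)     # g_phi(z) as a matrix
            z = z + G @ dX                                # Euler step of the CDE
        return self.to_latent(z)          # latent code summarising the whole drawing

class ODEDecoder(nn.Module):
    """Toy latent-conditioned Neural ODE: ds/dt = f_theta(s, z), Euler-integrated."""
    def __init__(self, latent=16, state_dim=3, hid=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(state_dim + latent, hid), nn.Tanh(),
                               nn.Linear(hid, state_dim))

    def forward(self, z, s0, num_steps=200, T=1.0):
        dt = T / num_steps
        s, out = s0, [s0]
        for _ in range(num_steps):
            s = s + dt * self.f(torch.cat([s, z], dim=-1))  # Euler step of the ODE
            out.append(s)
        return torch.stack(out)           # reconstructed trajectory s(t)

enc, dec = CDEEncoder(), ODEDecoder()
X = torch.rand(50, 3)                                    # placeholder drawing in (x, y, pen) format
recon = dec(enc(X), s0=X[0], num_steps=X.shape[0] - 1)   # same length as X
loss = ((recon - X) ** 2).mean()                         # reconstruction objective

The actual model also involves augmented and second-order ODE states (per the figure labels on the slide above); this sketch omits those details.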
  26. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important
  27. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important • Either represent pen-ups as straight lines with a state bit
  28. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important • Either represent pen-ups as straight lines with a state bit • Or, as a sequence of individual strokes
  29. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important • Either represent pen-ups as straight lines with a state bit • Or, as a sequence of individual strokes
  30. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important • Either represent pen-ups as straight lines with a state bit • Or, as a sequence of individual strokes Final state of the previous stroke is mapped to initial state of the next stroke (check the paper for more details)
  31. Full-sequence & Multi-stroke format • Exact format of 𝑠(𝑡) is

    important • Either represent pen-ups as straight lines with a state bit • Or, as a sequence of individual strokes. Final state of the previous stroke is mapped to initial state of the next stroke (check the paper for more details). (figure label: “events”; an illustrative preprocessing sketch follows below)
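A hypothetical preprocessing helper for the first ("full-sequence") option; the function name, bridge length and NumPy layout are mine, not the paper's. Each stroke contributes pen-down points (pen = 1), and consecutive strokes are joined by a straight-line bridge flagged with pen = 0.

import numpy as np

def to_full_sequence(strokes, n_bridge=5):
    """strokes: list of (N_i, 2) arrays of pen coordinates -> one (T, 3) sequence."""
    pieces = []
    for i, s in enumerate(strokes):
        pen = np.ones((len(s), 1))
        pieces.append(np.concatenate([s, pen], axis=1))        # pen-down points
        if i + 1 < len(strokes):
            # straight-line bridge from the end of this stroke to the start of the next,
            # flagged with pen state 0 so the model can tell it is not real ink
            a, b = s[-1], strokes[i + 1][0]
            ts = np.linspace(0.0, 1.0, n_bridge + 2)[1:-1, None]
            bridge = (1 - ts) * a + ts * b
            pieces.append(np.concatenate([bridge, np.zeros((n_bridge, 1))], axis=1))
    return np.concatenate(pieces, axis=0)

# e.g. seq = to_full_sequence([np.random.rand(20, 2), np.random.rand(15, 2)])
# seq is a (T, 3) array that can be treated as samples of s(t).

The multi-stroke alternative instead keeps strokes separate and maps the final state of one stroke to the initial state of the next, as the slide notes.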
  32. Training/Implementation tricks

  33. Training/Implementation tricks • Sin/Cos activation for dynamics functions • High

    frequency temporal changes in trajectory
  34. Training/Implementation tricks • Sin/Cos activation for dynamics functions • Captures

    high-frequency temporal changes in the trajectory • Continuous noise augmentation • More intuitive in the continuous-time case (illustrative sketches of both tricks follow below)
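Two illustrative sketches of the tricks above; the network shapes, the frequency scale omega, and the exact form of the noise are all my assumptions. First, a dynamics MLP with sinusoidal activations so it can express fast temporal changes; second, a continuous-time augmentation that perturbs the whole trajectory with a smooth random sinusoid rather than independent per-point noise.

import math
import torch
import torch.nn as nn

class SinDynamics(nn.Module):
    """Dynamics network using sin activations to express high-frequency changes."""
    def __init__(self, dim=3, hidden=64, omega=10.0):
        super().__init__()
        self.l1 = nn.Linear(dim, hidden)
        self.l2 = nn.Linear(hidden, dim)
        self.omega = omega                         # frequency scale of the activation

    def forward(self, t, s):
        return self.l2(torch.sin(self.omega * self.l1(s)))

def smooth_noise_augment(traj, t, n_waves=3, scale=0.01):
    """Add a smooth random sinusoid to a trajectory sampled at times t."""
    noise = torch.zeros_like(traj)
    for _ in range(n_waves):
        freq = torch.rand(1) * 2 * math.pi         # random frequency
        phase = torch.rand(1) * 2 * math.pi        # random phase
        amp = torch.randn(traj.shape[-1]) * scale  # per-channel amplitude
        noise = noise + amp * torch.sin(freq * t[:, None] + phase)
    return traj + noise

# Usage: augmented = smooth_noise_augment(torch.rand(200, 3), torch.linspace(0, 1, 200))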
  35. Reconstruction & Generation

  36. Reconstruction & Generation • Faithful reconstruction, just like deterministic RNN-RNN

  37. Reconstruction & Generation • Faithful reconstruction, just like deterministic RNN-RNN

    • Generation by injecting noise into latent space • RNN-RNNs break continuity due to autoregression.
  38. Reconstruction & Generation • Faithful reconstruction, just like deterministic RNN-RNN

    • Generation by injecting noise into the latent space • RNN-RNNs break continuity due to autoregression (figure label: One-shot Generation; an illustrative snippet follows below)
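Illustrative only, reusing the toy enc/dec modules and drawing X from the earlier sketch, with an arbitrary noise scale: perturb the latent code of an existing drawing and decode the whole trajectory in a single solve, so no autoregressive feedback can break continuity.

z = enc(X)                                # latent code of an existing drawing (toy modules above)
z_new = z + 0.1 * torch.randn_like(z)     # inject noise into the latent space
generated = dec(z_new, s0=X[0])           # decode the perturbed code in one shot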
  39. Inherently smooth latent space

  40. Inherently smooth latent space • Due to continuous nature of

    latent-to-decoder mapping, “SketchODE” enjoys inherent continuity
  41. Inherently smooth latent space • Due to continuous nature of

    latent-to-decoder mapping, “SketchODE” enjoys inherent continuity
  42. Inherently smooth latent space • Due to the continuous nature of the

    latent-to-decoder mapping, “SketchODE” enjoys inherent continuity (figure: one-shot interpolation, SketchODE vs. RNN-RNN panels; an interpolation sketch follows below)
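A sketch of latent-space interpolation, again built on the toy encoder/decoder defined earlier (X_a and X_b stand for any two drawings in the (x, y, pen) format): because the latent-to-trajectory map is a single continuous ODE solve, drawings decoded from interpolated codes deform smoothly with alpha.

z_a, z_b = enc(X_a), enc(X_b)                    # latent codes of two drawings
frames = []
for alpha in torch.linspace(0.0, 1.0, 8):
    z_mix = (1 - alpha) * z_a + alpha * z_b      # convex combination in latent space
    frames.append(dec(z_mix, s0=torch.zeros(3))) # one-shot decode of the mixture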
  43. Abstraction effect by squeezing frequency

  44. Abstraction effect by squeezing frequency • We noticed a peculiar

    property, i.e. “Abstraction effect”
  45. Abstraction effect by squeezing frequency • We noticed a peculiar

    property, i.e. “Abstraction effect” • Thanks to the periodic nature of activations and their frequencies
  46. Abstraction effect by squeezing frequency • We noticed a peculiar

    property, i.e. “Abstraction effect” • Thanks to the periodic nature of activations and their frequencies Decreasing frequency content
  47. Abstraction effect by squeezing frequency • We noticed a peculiar

    property, the “abstraction effect” • Thanks to the periodic nature of the activations and their frequencies (figure: decreasing frequency content). Refer to the paper for more details; a toy sketch of the idea follows below
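A guess at how the effect could be probed with the toy SinDynamics module and odeint from the earlier sketches (the specific omega values are arbitrary): lowering the frequency scale of the periodic activations at decode time strips high-frequency detail from the trajectory, giving progressively more abstract renderings.

f = SinDynamics(omega=10.0)
# ... train the model with the full frequency scale ...
s0 = torch.zeros(3)
t = torch.linspace(0.0, 1.0, 200)
for omega in [10.0, 7.0, 4.0, 2.0]:       # progressively squeeze the frequency content
    f.omega = omega
    abstract_traj = odeint(f, s0, t)      # re-decode; lower omega -> smoother, more abstract curve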
  48. Thank You! Check out the project page @ https://ayandas.me/sketchode