Deep Learning for Classical Japanese Literature

Kurian Benoy
August 20, 2019

This was presented as my seminar topic for the undergraduate B.Tech S8 semester. It is a brief summary of the paper: https://arxiv.org/abs/1812.01718

Transcript

  1. Deep Learning
    for Classical
    Japanese
    Literature
    Kurian Benoy
    CS-7 A
    34

  2. Contents
    ● Abstract
    ● Introduction
    ● Kuzushiji Dataset
    ● Classification Baselines
    ● Domain Transfer from Kuzushiji-Kanji to Modern Kanji
    ● Similar work in Chinese
    ● Why don’t we need domain transfer for Malayalam
    ● Summary

  3. Abstract
    ● To encourage ML researchers to build models of social and cultural
    relevance that transcribe Kuzushiji into contemporary Japanese characters.
    ● To release the Kuzushiji-MNIST, Kuzushiji-49 and Kuzushiji-Kanji datasets
    to the general public.
    ● Written by Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex
    Lamb, Kazuaki Yamamoto, David Ha.
    https://arxiv.org/abs/1812.01718

  4. Introduction
    Japan, the Land of the Rising Sun

  5. Introduction
    ● Historically, Japan and its culture were isolated from the West for a long
    period. This lasted until the Meiji Restoration in 1868, when a 15-year-old
    emperor unified Japan, which had previously been divided among small
    regional rulers.
    ● This caused a massive change in the Japanese language and in its writing and
    printing systems. Even though Kuzushiji had been used for over 1000 years,
    there are very few fluent readers of Kuzushiji today (only 0.01% of modern
    Japanese natives).

  6. Introduction
    Most Japanese natives today cannot read books written and published over 150 years
    ago. The General Catalog of National Books lists over 1.7 million books, and about
    3 million unregistered books are yet to be found. It is estimated that around a
    billion historical documents were written in Kuzushiji over a span of centuries.
    Most of this knowledge is now inaccessible to the general public.

  7. Kuzushiji Dataset
    Fig on left: Kuzushiji (old Japanese)
    Fig on right: Contemporary Japanese

  8. Kuzushiji Dataset
    Japanese writing can be divided into two types of systems:
    ● Logographic systems, where each character represents a word or a phrase (with
    thousands of characters). A prominent logographic system is Kanji, which is
    based on the Chinese system.
    ● Syllabary systems, where words are constructed from syllables (similar
    to an alphabet). A prominent syllabary system is Hiragana with 49 characters
    (Kuzushiji-49), which prior to standardization of the script had several
    representations for each Hiragana character.

  9. a) Kuzushiji-MNIST:
    ● MNIST for handwritten digits is one of the most popular datasets to date and is
    usually the "hello world" of deep learning.
    ● Yet its 10 classes are far fewer than the 49 needed to fully represent Kuzushiji
    Hiragana.

  10. ● Since MNIST restricts us to 10 classes, we chose one character to represent
    each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.
    ● Kuzushiji-MNIST is more difficult than MNIST: the chance that a human
    identifies a character correctly is very low when each image is small and the
    images are stacked together in 5 rows.

  11. b) Kuzushiji-49
    ● As the name suggests, it is a much larger, imbalanced dataset containing 49
    Hiragana characters with 266,407 images.
    ● Both Kuzushiji-49 and Kuzushiji-MNIST consist of grayscale images at 28x28
    pixel resolution.
    ● The training and test sets are split in a ratio of 6/7 to 1/7 for each class.
    ● Several rare characters have only a small number of samples; for example,
    (e) in Hiragana has only 456 images.
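
A per-class 6/7 to 1/7 split like the one described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' actual procedure; the function name `split_per_class`, the seed, and the toy labels are assumptions made for the example.

```python
import random
from collections import defaultdict


def split_per_class(labels, test_fraction=1 / 7, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Each class contributes the same fraction to the test set,
        # so rare classes stay represented on both sides of the split.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy imbalanced label list (hypothetical): class 0 has 70 samples, class 1 has 14.
toy_labels = [0] * 70 + [1] * 14
train_idx, test_idx = split_per_class(toy_labels)
```

Splitting inside each class, rather than over the whole dataset at once, matters for an imbalanced dataset because a rare character could otherwise end up entirely in one of the two sets.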

  12. View Slide

  13. c) Kuzushiji-Kanji:
    ● Kuzushiji-Kanji has a total of 3832 character classes with 140,426 images.
    ● Kuzushiji-Kanji images are of a larger 64x64 pixel resolution, and the number
    of samples per class ranges from over a thousand to only one sample.

  14. To download the dataset:
    https://github.com/rois-codh/kmnist
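
Once the files are downloaded, loading them might look like the sketch below. It assumes the NumPy `.npz` distribution, with each file holding one array under the key `arr_0` and file names of the form `kmnist-train-imgs.npz` and `kmnist-train-labels.npz`; check the repository README for the exact names. The helper names `load_npz_array` and `load_split` are hypothetical.

```python
import os

import numpy as np


def load_npz_array(path):
    """Load the single array stored in a .npz file under the key 'arr_0'."""
    with np.load(path) as archive:
        return archive["arr_0"]


def load_split(directory, prefix="kmnist", split="train"):
    """Load images and labels for one split; file naming is an assumption."""
    images = load_npz_array(os.path.join(directory, f"{prefix}-{split}-imgs.npz"))
    labels = load_npz_array(os.path.join(directory, f"{prefix}-{split}-labels.npz"))
    if len(images) != len(labels):
        raise ValueError("image and label counts differ")
    # Scale 8-bit pixel values to floats in the range [0, 1].
    return images.astype("float32") / 255.0, labels
```

Calling `load_split("path/to/data")` returns the images as floats together with their integer labels, where `path/to/data` stands for wherever the files were saved.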

  15. Classification Baselines
    The paper reports baseline recognition accuracy on the Kuzushiji datasets,
    which cover both Kanji and Hiragana and are built from pre-processed images of
    characters from 35 books from the 18th century.

  16. You can improve on these results too. The current state-of-the-art model according
    to ROIS-CODH is a ResNet18 + VGG ensemble combined with capsule networks.
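
The slides do not say how the ensemble combines its members. One common choice is soft voting, that is, averaging the class probabilities of each model. The sketch below shows that idea with made-up probability vectors; the variable names and numbers are illustrative assumptions, not outputs of the real models.

```python
def average_probabilities(per_model_probs):
    """Average class-probability vectors from several models (soft voting)."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]


def predict(per_model_probs):
    """Return the index of the class with the highest averaged probability."""
    averaged = average_probabilities(per_model_probs)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical softmax outputs of three models for one image and three classes.
resnet_probs = [0.6, 0.3, 0.1]
vgg_probs = [0.2, 0.7, 0.1]
capsule_probs = [0.3, 0.6, 0.1]
ensemble_class = predict([resnet_probs, vgg_probs, capsule_probs])
```

Here one model prefers class 0, but the averaged vote still lands on class 1 because the other two agree on it.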

  17. PreAct ResNet with Manifold Mixup
    ● A method for learning better representations that acts as a regularizer and,
    at no significant additional computation cost, achieves improvements
    over strong baselines on supervised and semi-supervised learning tasks.
    ● The key assumption behind Manifold Mixup is that the dimensionality of the
    hidden states exceeds the number of classes, which is often the case in practice.
    ResNet Ensembled over Capsule Networks
    ● Ensemble of ResNet and VGG
    ● Ensembling ResNets with capsule networks
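
The core step of Manifold Mixup, interpolating the hidden states and the targets of two examples with the same coefficient, can be sketched as below. This is a simplified standard-library illustration, not the PreAct ResNet implementation: in the real method the mixing happens at a randomly chosen layer and the forward pass continues from the mixed state. The function names and `alpha=2.0` are assumptions.

```python
import random


def mix(a, b, lam):
    """Elementwise convex combination lam * a + (1 - lam) * b."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def manifold_mixup_pair(hidden_a, hidden_b, target_a, target_b, alpha=2.0, rng=random):
    """Mix two hidden states and their one-hot targets with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    return mix(hidden_a, hidden_b, lam), mix(target_a, target_b, lam), lam


# Toy hidden states (3 units) and one-hot targets (2 classes), all hypothetical.
rng = random.Random(0)
mixed_hidden, mixed_target, lam = manifold_mixup_pair(
    [1.0, 0.0, 2.0], [0.0, 1.0, 4.0], [1.0, 0.0], [0.0, 1.0], rng=rng
)
```

The network is then trained to predict the soft target `mixed_target` from `mixed_hidden`, which is what gives the method its regularizing effect.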

  18. My intuition
    ● EfficientNet coupled with capsule networks

  19. Domain Transfer
    ● The proposed model should map the pixel image of a given
    Kuzushiji-Kanji input to a vector image of its modern Kanji version.

  20. Algorithm
    1. Train two separate variational autoencoders on pixel versions of KanjiVG and
    Kuzushiji-Kanji at 64x64px resolution.
    2. Train a mixture density network to model P(Z_new | Z_old) as a mixture of
    Gaussians.
    3. Train a sketch-RNN to generate KanjiVG strokes conditioned on either z_new
    or z~_new ~ P(Z_new | Z_old).
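
Step 2 can be written out as a conditional mixture of Gaussians. The form below is the standard mixture density network formulation rather than something taken from the slides; the number of components $K$ and the diagonal covariance are assumptions.

```latex
P(Z_{\mathrm{new}} = z_{\mathrm{new}} \mid Z_{\mathrm{old}} = z_{\mathrm{old}})
  = \sum_{k=1}^{K} \pi_k(z_{\mathrm{old}})\,
    \mathcal{N}\!\left(z_{\mathrm{new}};\ \mu_k(z_{\mathrm{old}}),\ \sigma_k^2(z_{\mathrm{old}})\, I\right),
\qquad \sum_{k=1}^{K} \pi_k(z_{\mathrm{old}}) = 1
```

The network takes $z_{\mathrm{old}}$ as input and outputs the mixing weights $\pi_k$, means $\mu_k$ and variances $\sigma_k^2$, so the sample $\tilde{z}_{\mathrm{new}}$ used in step 3 is drawn from this distribution.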

  21. View Slide

  22. Components of this network
    ● Autoencoders and Decoders
    They are a widely used unsupervised application of neural networks whose
    original purpose is to find latent lower-dimensional state spaces of datasets, but
    they are also capable of solving other problems, such as image denoising,
    enhancement or colourization.
    ● Variational autoencoders are used to provide the latent spaces of KanjiVG and
    Kuzushiji-Kanji. In this architecture they refine the input and provide
    better colourization and enhancement. They are also used in complex generative
    models.
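
Two pieces that distinguish a variational autoencoder from a plain autoencoder are the reparameterized sampling of the latent vector and the KL term that pulls the latent distribution toward a standard normal. The sketch below shows both for a diagonal Gaussian using only the standard library. It is a generic illustration, not the model from the paper, and the function names are assumptions.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


# Hypothetical encoder outputs for a 2-dimensional latent space.
z = reparameterize([0.5, -1.0], [0.0, -2.0], rng=random.Random(0))
kl = kl_to_standard_normal([0.5, -1.0], [0.0, -2.0])
```

During training this KL value is added to the reconstruction loss, and the sampled `z` is what the decoder receives.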

  23. ● Mixture Density Network:
    Used to model the density function over the new domain. It lets the neural
    networks translate from Kuzushiji-Kanji to the pixel version of KanjiVG.
    ● Sketch-RNN
    A decoder network that is conditioned on the new latent vector.
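
Drawing a new latent vector from a mixture density network's output amounts to picking one Gaussian component according to the mixing weights and then sampling from it. The sketch below assumes diagonal Gaussians and takes the mixture parameters as plain lists; in the real model they would be produced by the network from the old latent vector. The function name and toy values are hypothetical.

```python
import random


def sample_mixture(weights, means, std_devs, rng=random):
    """Draw one vector from a mixture of diagonal Gaussians."""
    # Choose a component index with probability proportional to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Sample each latent dimension independently from the chosen component.
    return [rng.gauss(m, s) for m, s in zip(means[k], std_devs[k])]


# Toy mixture with two components in a 2-dimensional latent space.
z_new = sample_mixture(
    weights=[0.3, 0.7],
    means=[[0.0, 0.0], [2.0, -1.0]],
    std_devs=[[1.0, 1.0], [0.5, 0.5]],
    rng=random.Random(0),
)
```

The resulting `z_new` plays the role of the latent vector on which the sketch-RNN decoder is conditioned.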

  24. Comparison with Chinese Kanji
    ● Training two separate VAE encoders, as in this algorithm, gives better
    performance than using a single VAE encoder.
    ● Sketch-RNN gives better accuracy than char-RNN.
    ● Using adversarial losses as in other approaches is not necessary.
    http://otoro.net/kanji-rnn/

  25. View Slide

  26. Why not such a system for Malayalam?
    ● We also have a similar problem of domain transfer.
    ● This is due to a government rule in 1956 that limited Malayalam typography
    to only 56 characters.

  27. ● The free software community Swathanthra Malayalam Computing has
    already created mappings for 1200 characters in Malayalam.
    Unicode

  28. Summary
    ● Explored deep learning techniques for classifying Classical Japanese
    (Kuzushiji) and for domain transfer to contemporary Japanese.
    ● Looked at the various Kuzushiji datasets.

  29. Thanks
    ● Slides: bit.ly/japanslides
    ● Brief summary: https://kurianbenoy.github.io/
    ● Research paper:
    https://arxiv.org/pdf/1812.01718.pdf
