
Using a Simple ACGAN for Manchu Script Dataset Supplementation


Poster presentation: 2023 Korea Multimedia Society Spring Conference

Abstract: The Manchu script is difficult to process with deep learning techniques, in part because there is not much data widely available for neural network training. Therefore, this research trained a simple ACGAN for the purpose of supplementing an existing, small-scale Manchu dataset. The ACGAN was trained for 30,000 steps on a dataset consisting of a total of 4,000 Manchu script letters. The resulting images produced by the generator model were sufficiently recognizable for dataset supplementation.

Aaron Snowberger

May 19, 2023


Transcript

  1. Using a Simple ACGAN
    For Manchu Script
    Dataset Supplementation
    Aaron Snowberger • 이충호 (C. H. Lee), 한밭대학교 (Hanbat National University)
    2023.05.19 한국멀티미디어학회 춘계학술대회 (Korea Multimedia Society Spring Conference)


  2. Abstract
    The Manchu script is difficult to process with deep learning techniques, in part because there is
    not much data widely available for neural network training. Therefore, this research trained a
    simple ACGAN for the purpose of supplementing an existing, small-scale Manchu dataset. The
    ACGAN was trained for 30,000 steps on a dataset consisting of a total of 4,000 Manchu script
    letters. The resulting images produced by the generator model were sufficiently recognizable for
    dataset supplementation.
    Keywords:
    GAN | ACGAN | Manchu Script | Dataset Supplementation


  3. Introduction
    Difficulties in Preprocessing
    Manchu script is written vertically, with every letter of a word connected by a central stem.
    This makes segmentation of letters difficult for pre-processing.
    Unavailability of Datasets
    Large datasets of Manchu script are not widely available for machine learning. This is the
    problem this research paper addresses.


  4. Related Research
    Two Preprocessing Techniques
    Two main techniques have been used for Manchu script recognition. The first attempts letter
    segmentation as part of the pre-processing step [1,2], but this has proven difficult. The
    second attempts segmentation-free recognition [3].
    Dataset in this Research
    This research utilizes an existing, small-scale dataset of segmented Manchu script letters [4]
    with over a dozen different handwriting styles.


  5. System Model & Methods
    01 Generator
    Layers:
    ● Input, dense, reshape
    ● Conv2DTranspose x4
    ○ BatchNormalization
    ○ ReLU
    ● Filters: [128, 64, 32, 1]
    ● Stride: 2 (last 2 layers: 1)
    ● Kernel size: 5
    Params: 1,332,161 trainable (704 non-trainable)
    02 Discriminator
    Layers:
    ● Input
    ● Conv2D x4
    ○ LeakyReLU (alpha = 0.2)
    ● Dense x3
    ● Activation layer
    ● Filters: [32, 64, 128, 256]
    ● Stride: 2 (last Conv: 1)
    ● Kernel size: 5
    Params: 1,605,638 trainable (0 non-trainable)
    03 ACGAN Training
    ● batch_size = 64
    ● latent_size = 100
    ● learning_rate = 2e-4
    ● decay = 6e-8
    ● Optimizer: RMSprop (discriminator)
    ● Loss functions:
    ○ Real/fake (source) output: binary_crossentropy
    ○ Class-label output: categorical_crossentropy
    ● Steps = 30,000
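    The parameter totals above can be sanity-checked arithmetically. The sketch below reproduces
    them under the assumption that the network follows the ACGAN reference implementation in [5],
    with 28x28 grayscale inputs, a 100-dim latent vector, and 5 class labels; the image size and
    class count are inferences from the totals, not values stated on the slide.

    ```python
    def conv_params(k, c_in, c_out):
        # Conv2D / Conv2DTranspose with bias: k*k*c_in*c_out weights + c_out biases
        return k * k * c_in * c_out + c_out

    def dense_params(n_in, n_out):
        return n_in * n_out + n_out

    k, latent, classes = 5, 100, 5  # kernel size 5 and latent 100 from the slide; 5 classes inferred

    # Generator: Dense -> reshape to 7x7x128 -> (BatchNorm, ReLU, Conv2DTranspose) x4
    gen = dense_params(latent + classes, 7 * 7 * 128)
    bn_channels = [128, 128, 64, 32]              # channels entering each BatchNormalization
    gen += sum(2 * c for c in bn_channels)        # trainable gamma/beta
    gen_frozen = sum(2 * c for c in bn_channels)  # non-trainable moving mean/variance
    for c_in, c_out in zip([128, 128, 64, 32], [128, 64, 32, 1]):
        gen += conv_params(k, c_in, c_out)

    # Discriminator: Conv2D x4 (28 -> 14 -> 7 -> 4 -> 4 with 'same' padding), then 3 Dense heads
    disc = sum(conv_params(k, c_in, c_out)
               for c_in, c_out in zip([1, 32, 64, 128], [32, 64, 128, 256]))
    flat = 4 * 4 * 256
    disc += dense_params(flat, 1)       # real/fake output (sigmoid)
    disc += dense_params(flat, 128)     # hidden Dense of the class head
    disc += dense_params(128, classes)  # class-label output (softmax)

    print(gen, gen_frozen, disc)  # 1332161 704 1605638
    ```

    Both totals and the 704 non-trainable BatchNormalization statistics match the slide exactly,
    which supports the assumed layer configuration.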


  6. Results
    Generated sample images at training steps 500, 2500, 5000, 10000, 15000, 20000, 25000, and 30000.


  7. Discussion & Conclusion
    ● As the figures indicate, the generated images became progressively more accurate over time.
    However, some graininess and noise can also be seen in some of the later images. This is not
    an error in the training of the GAN, but a reflection of the dataset itself: because ancient
    handwritten Manchu texts were scanned and cropped to create the training dataset, some
    noise was present in some of the image backgrounds.
    ● Therefore, in the creation of later Manchu datasets, a better image thresholding algorithm
    will be used to minimize background noise. Nonetheless, the results of this ACGAN research
    demonstrate the feasibility of supplementing small-scale Manchu datasets with generated
    images to bolster neural network training.
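    The slide does not name the thresholding algorithm to be used; Otsu's method is one standard
    choice for separating dark ink strokes from a grainy scan background. A minimal NumPy sketch
    (the function name and NumPy usage are illustrative assumptions, not the authors' code):

    ```python
    import numpy as np

    def otsu_threshold(img):
        """Return the Otsu threshold (0-255) maximizing between-class
        variance for a uint8 grayscale image; pixels <= t count as ink."""
        hist = np.bincount(img.ravel(), minlength=256).astype(float)
        total = hist.sum()
        sum_all = float(np.dot(np.arange(256), hist))
        w0 = sum0 = 0.0
        best_t, best_var = 0, -1.0
        for t in range(256):
            w0 += hist[t]        # weight of the dark class [0..t]
            if w0 == 0:
                continue
            w1 = total - w0      # weight of the bright class (t..255]
            if w1 == 0:
                break
            sum0 += t * hist[t]
            m0, m1 = sum0 / w0, (sum_all - sum0) / w1
            var_between = w0 * w1 * (m0 - m1) ** 2
            if var_between > best_var:
                best_var, best_t = var_between, t
        return best_t

    # Binarize a scan: keep dark strokes, push grainy background to pure white
    # img_bin = np.where(img <= otsu_threshold(img), 0, 255).astype(np.uint8)
    ```

    Applied before cropping, this would remove most background speckle from the training images
    and therefore from the GAN's generated samples as well.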


  8. References
    1. G. Y. Zhang, J. J. Li, and A. X. Wang, "A New Recognition Method for the Handwritten Manchu Character Unit," in
    Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, China, 2006, pp.
    3339-3344, DOI: 10.1109/ICMLC.2006.258471.
    2. A. Snowberger and C. H. Lee, "A New Segmentation and Extraction Method for Manchu Character Units," in
    Proceedings of the 2022 International Conference on Future Information and Communication Engineering, Jeju, South
    Korea, 2022, pp. 42-47.
    3. R. Zheng, M. Li, J. He, J. Bi, and B. Wu, "Segmentation-Free Multi-Font Printed Manchu Word Recognition Using
    Deep Convolutional Features and Data Augmentation," in 2018 11th International Congress on Image and Signal
    Processing, BioMedical Engineering and Informatics (CISP-BMEI), Beijing, China, 2018, pp. 1-6, DOI:
    10.1109/CISP-BMEI.2018.8633208.
    4. A. Snowberger and C. H. Lee, "A Simple MNIST Style Dataset and CNN Training for Manchu Script Characters," in
    Proceedings of the 15th International Conference on Future Information & Communication Engineering, Jeju, South
    Korea, 2023, pp. 136-138.
    5. R. Atienza, Advanced Deep Learning with TensorFlow 2 and Keras, 2nd ed., Birmingham, UK: Packt Publishing Ltd.,
    Feb. 2020. [Online]. Available: https://www.packtpub.com/book/programming/9781838821654/
