Slide 1

Slide 1 text

A Big Data Analysis of Yumentingzheng (御門聽政): Weiwenqiju (慰問起居) as an Example
Aaron Daniel Snowberger, Choong Ho Lee

Slide 2

Slide 2 text

Introduction
Yumentingzheng (御門聽政), which records the Qing emperors' discussions with their subjects, is an important historical document comparable to the Annals of Joseon in Korea. This paper describes the methods and steps for big data analysis of the Yumentingzheng, which is written in the Manchu alphabet. Big data analysis of documents in Manchu script raises many problems that must be solved beforehand, and research on them has to come first. This paper therefore proposes a method of big data analysis in the R language that starts from the stage at which the Manchu text has already been transliterated into Latin characters; the transliteration itself is left to a preliminary study to be conducted in the future. The proposed method adopts the Abkai method for transliterating the Yumentingzheng, and the results of the analysis are presented using the text of Weiwenqiju (慰問起居).

Slide 3

Slide 3 text

Möllendorff to Abkai
A text [1] that had already been transcribed with the Möllendorff method was converted to the Abkai method, and R was then used to analyze the frequency of the words appearing in it.

No.  Möllendorff  Abkai  Manchu
01   š            x      ᡧ
02   c            q      ᠴ
03   ū            v      ᡡ
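The conversion step can be illustrated with a minimal sketch. It is written in Python rather than the authors' R, and it covers only the three substitutions listed on this slide; the function name `to_abkai` and the sample words are assumptions chosen for illustration and are not taken from the Weiwenqiju text.

```python
import unicodedata

# The three Möllendorff -> Abkai substitutions listed on the slide.
# Any further differences between the two systems are not handled here.
MOLLENDORFF_TO_ABKAI = {"š": "x", "c": "q", "ū": "v"}


def to_abkai(text: str) -> str:
    """Convert a Möllendorff transcription to Abkai (hypothetical helper)."""
    # Normalize first so that "s" + combining caron is treated as "š".
    normalized = unicodedata.normalize("NFC", text)
    return normalized.translate(str.maketrans(MOLLENDORFF_TO_ABKAI))


# Illustrative sample words, not quoted from the source text.
print(to_abkai("šun cooha gūnin"))  # xun qooha gvnin
```

Because every substitution maps one character to one character, a translation table is enough; no regular expressions or ordering rules are needed for these three letters.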

Slide 4

Slide 4 text

Manchu Text Latin Transliteration
[Figure: the same text shown in three forms: Manchu, Möllendorff, Abkai]
The original Manchu text was transcribed using the Möllendorff method. The transcription was then edited according to the Abkai method in the previous table. The resulting edited transcription was used in the big data analysis.

Slide 5

Slide 5 text

1st Wordcloud Analysis
To make the word cloud show meaningful and important words, the meaningless words among those displayed in large letters must be found and removed.
The words ‘be’ and ‘de’ appear most often because nouns cannot be extracted separately from the particles that follow them.

No.  Word    Meaning
01   be      particle (object marker; "by means of"; "causing ... to")
02   de      at, in, to, regarding
03   amban   minister (大臣)
04   kemuni  always
05   aniya   year
06   yasa    eye (目)
07   udu     even though
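The frequency count that a word cloud like this is built on can be sketched as follows. This is a Python illustration rather than the authors' R code; it assumes the Abkai transcription is plain Latin text whose words are separated by whitespace. The helper name `word_frequencies` and the sample string are made up for the example and are not from the source text.

```python
import string
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count whitespace-separated words, ignoring case and edge punctuation."""
    tokens = (t.strip(string.punctuation) for t in text.lower().split())
    return Counter(t for t in tokens if t)


# Made-up sample that only reuses words from the table above.
sample = "amban be aniya de amban be yasa be."
freq = word_frequencies(sample)
print(freq.most_common(2))  # [('be', 3), ('amban', 2)]
```

Since particles such as ‘be’ are written as separate words, a plain count like this ranks them first, which matches what the first word cloud shows.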

Slide 6

Slide 6 text

2nd Wordcloud Analysis

No.  Word    Meaning
01   yasa    eye (目)
02   amban   minister (大臣)
03   kemuni  always
04   aniya   year
05   udu     even though

With the particles ‘be’ and ‘de’ removed, along with pronouns and other function words such as ‘bi’, ‘si’, ‘udu’, and ‘mini’, the most important words become visible.
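Removing such words amounts to filtering the frequency table against a stop-word list. The sketch below is in Python rather than the authors' R. The stop-word set contains only the six words named on this slide. The counts for ‘yasa’ and ‘amban’ are the ones reported in the deck, while the counts for ‘be’, ‘de’, and ‘bi’ are placeholder assumptions.

```python
from collections import Counter

# Stop words named on the slide; a real list would likely be longer.
STOP_WORDS = {"be", "de", "bi", "si", "udu", "mini"}


def remove_stop_words(freq: Counter, stop_words=STOP_WORDS) -> Counter:
    """Return a copy of the frequency table without the stop words."""
    return Counter({w: n for w, n in freq.items() if w not in stop_words})


# 'yasa' and 'amban' counts come from the deck; the others are placeholders.
freq = Counter({"be": 12, "de": 11, "yasa": 10, "amban": 7, "bi": 4})
filtered = remove_stop_words(freq)
print(filtered.most_common())  # [('yasa', 10), ('amban', 7)]
```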

Slide 7

Slide 7 text

Wordcloud2 Analysis
The R package wordcloud2 produces a wordcloud with the ability to mouseover any word to see its frequency. (This requires sorting the table in decreasing order first.)

Word    Count
yasa    10
qi      9
ye      9
se      9
ere     8
ba      7
amban   7
aniya   7
the     7
kemuni  6
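The sorting step mentioned above can be sketched like this. It is a Python stand-in for sorting a frequency table in decreasing order in R, not the authors' code. The counts are a subset of the table on this slide, deliberately listed out of order, and the variable names are illustrative.

```python
# Subset of the slide's counts, deliberately out of order.
counts = {"amban": 7, "yasa": 10, "kemuni": 6, "qi": 9, "ere": 8}

# Sort (word, count) pairs by count, largest first.
ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for word, n in ranked:
    print(f"{word:<8}{n:>3}")
```

The result is a list of word and frequency pairs, which is the same two-column shape that a word-cloud tool would take as input.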

Slide 8

Slide 8 text

Conclusion
This paper presented a method for big data analysis of literature written in Manchu characters. Because no Manchu dictionary package is yet available for the R language, the word cloud was created from the word frequencies of the text as it stands. The romanization of the Manchu text was converted to the Abkai method, which uses no special symbols. The effectiveness of this method was demonstrated in an experiment on the Weiwenqiju portion of the Yumentingzheng.
CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik

Slide 9

Slide 9 text

References
[1] Zhuang Jifa, Yumentingzheng, Wenshizhe Press, 2000.
[2] Manchu alphabet. [Internet] Available: https://en.wikipedia.org/wiki/Manchu_alphabet
[3] Diandian Zhang, Yan Liu, Zhuowei Wang, and Depei Wang, "OCR with the Deep CNN Model for Ligature Script-Based Languages like Manchu," Hindawi Scientific Programming, vol. 2021, Article ID 5520338, https://doi.org/10.1155/2021/5520338
[4] Jang Yongsik, Kang Higu, Learning to code in R language, Saengneung Press, 2018.
[5] Manchu Language. [Internet] Available: https://namu.wiki/w/%EB%A7%8C%EC%A3%BC%EC%96%B4