A Big Data Analysis of Yumentingzheng

A Big Data Analysis of Yumentingzheng (御門聽政): Weiwenqiju (慰問起居) as an Example
Aaron Daniel Snowberger, Choong Ho Lee

Introduction
Yumentingzheng (御門聽政), which records the Qing dynasty emperor's discussions with his subjects, is an important document comparable to the Annals of the Joseon Dynasty in Korea. This paper describes the methods and steps for big data analysis of Yumentingzheng, which is written in the Manchu alphabet. Big data analysis of documents in Manchu script raises many problems that must be solved beforehand, and research on them has to come first. This paper therefore proposes a method of big data analysis in the R language that starts from text already transliterated from Manchu script into Latin characters; the transliteration itself is left to a preliminary study to be conducted in the future. The proposed method adopts the Abkai method for transliterating Yumentingzheng, and the results of the big data analysis are presented using the text of Weiwenqiju (慰問起居).

Text [1] that had already been transcribed with the Möllendorff method was converted to the Abkai method, and big data analysis was then performed in R on the frequency of the words appearing in the text.

Möllendorff  Abkai  Manchu
š            x      ᡡ
c            q      ᡡ
ū            v      ᡡ
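The character substitutions listed above can be sketched in a few lines. The deck's analysis was done in R; the Python below is only an illustrative sketch, and the function name `to_abkai` and the sample words are assumptions, not taken from the deck. It covers only the three pairs shown in the table and lowercase letters.

```python
import unicodedata

# Möllendorff -> Abkai substitutions from the table (only these three pairs).
MOLLENDORFF_TO_ABKAI = {"š": "x", "c": "q", "ū": "v"}

# str.translate applies all single-character substitutions in one pass.
_TABLE = str.maketrans(MOLLENDORFF_TO_ABKAI)


def to_abkai(text: str) -> str:
    """Convert Möllendorff-romanized Manchu text to Abkai romanization."""
    # NFC merges a base letter plus combining mark (s + caron) into the
    # single precomposed character (š) that the table expects.
    normalized = unicodedata.normalize("NFC", text)
    return normalized.translate(_TABLE)


if __name__ == "__main__":
    # Hypothetical sample words, not quoted from the Weiwenqiju text.
    print(to_abkai("šun ci ūlet"))  # prints: xun qi vlet
```

Normalizing first matters because a transcription typed on different systems may store "š" either as one code point or as "s" plus a combining caron; without it, the second form would pass through unchanged.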

Manchu Text Latin Transliteration
[Figure panels: Manchu, Möllendorff, Abkai]
The original Manchu text was transcribed using the Möllendorff method. The transcription was then edited according to the Abkai substitutions in the previous table. The resulting edited transcription was used in the big data analysis.

1st Wordcloud Analysis
To display a word cloud of meaningful and important words, the meaningless words among those displayed in large letters must be found and removed. The words ‘be’ and ‘de’ appear most often because nouns cannot be extracted individually.

Word    Meaning
be      object particle; also "by means of", "causing (someone) to"
de      particle: at, in, to, regarding
amban   minister (大臣)
kemuni  always
aniya   year
yasa    eye (目)
udu     even though
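The word frequencies behind such a word cloud come from a simple count of whitespace-separated tokens. The following is a minimal Python sketch of that counting step, not the R code used for the deck; the function name `word_frequencies` and the sample token sequence are assumptions made for illustration.

```python
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count whitespace-separated words in romanized Manchu text."""
    # Lower-case and strip surrounding punctuation so "be," and "be" match.
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(w for w in words if w)


if __name__ == "__main__":
    # Made-up token sequence for illustration only; it is not a real
    # sentence from the Weiwenqiju text.
    sample = "amban be aniya de yasa be kemuni de be"
    for word, count in word_frequencies(sample).most_common(2):
        print(word, count)
```

Because every token is counted as-is, with no dictionary or morphological analysis, frequent particles naturally rise to the top of the table.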

2nd Wordcloud Analysis
To display a word cloud of meaningful and important words, the meaningless words among those displayed in large letters must be found and removed.

Word    Meaning
yasa    eye (目)
amban   minister (大臣)
kemuni  always
aniya   year
udu     even though

With the particles ‘be’ and ‘de’ removed, along with pronouns and other function words such as ‘bi’, ‘si’, ‘udu’, and ‘mini’, we can find the most important words.
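Removing such words amounts to filtering the frequency table against a stop-word list. The sketch below shows that step in Python rather than the R used for the deck; the stop-word list repeats the six words named on this slide, while the function name `remove_stop_words` and the sample counts are assumptions for illustration only.

```python
from typing import Dict, FrozenSet

# The six words this slide names as removed before the second word cloud.
STOP_WORDS: FrozenSet[str] = frozenset({"be", "de", "bi", "si", "udu", "mini"})


def remove_stop_words(
    freq: Dict[str, int], stop_words: FrozenSet[str] = STOP_WORDS
) -> Dict[str, int]:
    """Return a copy of the frequency table without the stop words."""
    return {word: n for word, n in freq.items() if word not in stop_words}


if __name__ == "__main__":
    # Illustrative counts only; not the real Weiwenqiju frequencies.
    freq = {"be": 30, "de": 25, "yasa": 10, "mini": 4, "amban": 7}
    print(remove_stop_words(freq))  # prints: {'yasa': 10, 'amban': 7}
```

Keeping the list as a separate constant makes it easy to extend as more meaningless words are spotted in the cloud.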

Wordcloud2 Analysis
The R package wordcloud2 produces a word cloud in which hovering the mouse over any word shows its frequency. (This requires first sorting the frequency table in decreasing order.)

Word    Count
yasa    10
qi      9
ye      9
se      9
ere     8
ba      7
amban   7
aniya   7
the     7
kemuni  6
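The sorting step mentioned above can be illustrated independently of the plotting package. This Python sketch only shows the ordering of a frequency table by decreasing count; it does not call wordcloud2, which is an R package. The counts are the ten values from this slide, deliberately shuffled, and the names `SLIDE_COUNTS` and `sort_by_count` are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Word counts from this slide, deliberately listed out of order.
SLIDE_COUNTS: Dict[str, int] = {
    "amban": 7, "qi": 9, "kemuni": 6, "yasa": 10, "ba": 7,
    "ye": 9, "the": 7, "ere": 8, "se": 9, "aniya": 7,
}


def sort_by_count(freq: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (word, count) pairs sorted by count, largest first."""
    # sorted() is stable even with reverse=True, so words with equal
    # counts keep the order in which they appear in the input.
    return sorted(freq.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for word, count in sort_by_count(SLIDE_COUNTS):
        print(f"{word:8}{count}")
```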

Conclusion
This paper presented a method for big data analysis of literature written in Manchu characters. Because no Manchu dictionary package is yet available for the R language, the word cloud was created from word frequencies in the text as it stands. The romanization of the Manchu characters was converted to the Abkai method, which uses no special symbols. The effectiveness of the method was demonstrated by an experiment on the Weiwenqiju portion of the Yumentingzheng.

CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik

References
[1] Zhuang Jifa, Yumentingzheng, Wenshizhe Press, 2000.
[2] Manchu alphabet. [Internet]. Available: https://en.wikipedia.org/wiki/Manchu_alphabet
[3] Diandian Zhang, Yan Liu, Zhuowei Wang, and Depei Wang, "OCR with the Deep CNN Model for Ligature Script-Based Languages like Manchu," Hindawi Scientific Programming, vol. 2021, Article ID 5520338. https://doi.org/10.1155/2021/5520338
[4] Jang Yongsik and Kang Higu, Learning to code in R language, Saengneung Press, 2018.
[5] Manchu Language. [Internet]. Available: https://namu.wiki/w/%EB%A7%8C%EC%A3%BC%EC%96%B4
