Suffix array (SA) + longest common prefix array (LCP)
• 9n bytes for a text of length n [1]
• More compact than a suffix tree (20n bytes or more) [2]
[1] D. Okanohara and J. Tsujii. 2009. Text Categorization with All Substring Features. In the SIAM International Conference on Data Mining (SDM).
[2] M. I. Abouelhoda, S. Kurtz, and E. Ohlebusch. 2004. Replacing suffix trees with enhanced suffix arrays. J. Discrete Algs, 2:53–86.
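As a rough illustration of the two arrays named on this slide, here is a minimal Python sketch. The function names `build_suffix_array` and `build_lcp_kasai` are hypothetical and do not come from the cited papers. The suffix array is built by naive sorting for readability (not in linear time), while the LCP array follows the linear-time scheme of Kasai et al. The 9n-byte figure would be consistent with two 4-byte integer arrays plus one byte per character of text, but that breakdown is an assumption here.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction by sorting suffix slices; fine for small inputs,
    but not the linear-time construction used in practice.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_kasai(text: str, sa: list[int]) -> list[int]:
    """Return lcp where lcp[i] is the length of the longest common prefix
    of the suffixes at sa[i - 1] and sa[i]; lcp[0] is 0 by convention.
    """
    n = len(text)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    h = 0  # current match length, carried over between iterations
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters
    return lcp


sa = build_suffix_array("banana")
lcp = build_lcp_kasai("banana", sa)
```

For the text "banana", the sorted suffixes are a, ana, anana, banana, na, nana, so `sa` holds their start positions and `lcp` holds the shared prefix lengths of neighbouring suffixes.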
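LCP) are built
2. Check the suffixes at which the BWT changes [2]
3. Enumerate the internal nodes using the LCP
• While doing so, check the BWT and enumerate only the maximal substrings
• The internal nodes of the suffix tree can be enumerated in time linear in the length of the text T [1]
• The check for BWT changes can also be done in linear time
[1] T. Kasai, G. Lee, H. Arimura, S. Arikawa and K. Park "Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications", CPM 2001
[2] Maximal substrings (極大部分文字列), アスペ日記 http://d.hatena.ne.jp/takeda25/20101202/1291269994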
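The steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the name `maximal_substrings` and the helper names are hypothetical, the suffix array is built by naive sorting rather than in linear time, and only substrings occurring at least twice (internal nodes) are reported. Internal nodes are enumerated as LCP intervals with a stack, and an interval is kept only if the BWT characters inside it are not all identical, which is tested in constant time per interval with a prefix count of BWT changes.

```python
def suffix_array(text: str) -> list[int]:
    # Naive O(n^2 log n) construction, used here only for readability.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text: str, sa: list[int]) -> list[int]:
    # Kasai-style linear-time LCP: lcp[i] compares sa[i - 1] and sa[i].
    n = len(text)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos
    lcp, h = [0] * n, 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        h = max(h - 1, 0)
    return lcp


def maximal_substrings(text: str) -> list[tuple[str, int]]:
    """Return (substring, frequency) for repeated maximal substrings."""
    n = len(text)
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)

    # BWT character of each suffix; "" marks the start of the text and
    # therefore differs from every real character.
    bwt = [text[s - 1] if s > 0 else "" for s in sa]

    # changes[i] = number of positions k in 1..i with bwt[k] != bwt[k - 1].
    changes = [0] * n
    for i in range(1, n):
        changes[i] = changes[i - 1] + (bwt[i] != bwt[i - 1])

    result = []
    stack = [(0, 0)]  # (lcp depth, left boundary) of open intervals
    for i in range(1, n + 1):
        cur = lcp[i] if i < n else -1  # -1 closes all open intervals
        left = i - 1
        while stack and stack[-1][0] > cur:
            depth, lb = stack.pop()
            rb = i - 1
            # Keep the node only if the BWT varies inside [lb, rb].
            if depth > 0 and changes[rb] - changes[lb] > 0:
                result.append((text[sa[lb]:sa[lb] + depth], rb - lb + 1))
            left = lb
        if not stack or stack[-1][0] < cur:
            stack.append((cur, left))
    return result


found = maximal_substrings("banana")
```

In "banana", the substring "na" is an internal node but is always preceded by "a", so its BWT interval is constant and it is dropped in favour of "ana".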
keyphrase extraction", Information retrieval 2.4 (2000): 303-336
• https://arxiv.org/pdf/cs/0212020.pdf
• [Hasan & Ng 14] Kazi Saidul Hasan and Vincent Ng. "Automatic Keyphrase Extraction: A Survey of the State of the Art", Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014, pages 1262-1273
• https://www.aclweb.org/anthology/P/P14/P14-1119.xhtml
• A systematic review paper on automatic keyphrase extraction
• [Liu+ 09] Z. Liu, P. Li, Y. Zheng and M. Sun. "Clustering to find exemplar terms for keyphrase extraction", 2009, pp. 257–266
• When building candidate keyphrases, a stop-word dictionary is used to filter out stop words
NL187, 自然言語処理研究会 (SIG on Natural Language Processing) 2008
• http://ci.nii.ac.jp/naid/110006980330
• [Okanohara & Tsujii 09] D. Okanohara and J. Tsujii. "Text Categorization with All Substring Features", In the SIAM International Conference on Data Mining (SDM) 2009
• http://epubs.siam.org/doi/abs/10.1137/1.9781611972795.72
• [Abouelhoda+ 04] M. I. Abouelhoda, S. Kurtz, and E. Ohlebusch. "Replacing suffix trees with enhanced suffix arrays", J. Discrete Algs 2004, 2:53–86.
• https://pdfs.semanticscholar.org/4ca9/ea95a0a9846965e86619e646d9ca36930c18.pdf
• [Kasai+ CPM 01] T. Kasai, G. Lee, H. Arimura, S. Arikawa and K. Park "Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications", CPM 2001
References