= doc_count
n_i : number of entries across the whole index in which the term appears = doc_freq

Papineni, K. (2001). "Why Inverse Document Frequency?" In Proc. of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001), pp. 25–32.
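As a minimal sketch of how doc_count and doc_freq combine into an IDF value: the slide does not show the exact formula here, so the plain log2(doc_count / doc_freq) form below is an assumption, and the function name `idf` is hypothetical.

```python
import math


def idf(doc_count: int, doc_freq: int) -> float:
    """Inverse document frequency of one term.

    Assumption: base-2 logarithm and no smoothing; the source does not
    state which variant it uses.
    """
    if doc_count <= 0 or not 0 < doc_freq <= doc_count:
        raise ValueError("need 0 < doc_freq <= doc_count")
    return math.log2(doc_count / doc_freq)


# A term that appears in 16 of 1024 entries.
print(idf(1024, 16))  # 6.0
```

Under this assumed form, a term found in every entry scores 0, and the score grows as the term gets rarer.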
/ AP / MAP
Optimization by MAP
• P@n; Precision at n: precision over the results down to rank n
• AP; Average Precision: P@n averaged over the ranks up to n
• MAP; Mean Average Precision: AP averaged over all entries

Max MAP : 0.9173
-----------------------
IDF threshold : 6.0
RIDF threshold : 0.55
Gain threshold : 0.0

Good kids do cross-validation.
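The three metrics can be sketched as follows. This is an illustration, not the code behind the 0.9173 figure: it assumes the common AP variant that averages P@k over the ranks k where a relevant entry appears and divides by the number of relevant entries. All function names and the sample data are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_n(ranking: Sequence[str], relevant: Set[str], n: int) -> float:
    """P@n: fraction of the top n ranked entries that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for doc in ranking[:n] if doc in relevant) / n


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """AP: mean of P@k taken at each rank k that holds a relevant entry."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k  # P@k at this relevant rank
    return total / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP: AP averaged over all (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up rankings for two queries.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
print(precision_at_n(runs[0][0], runs[0][1], 2))   # 0.5
print(round(average_precision(*runs[0]), 4))       # 0.8333
print(round(mean_average_precision(runs), 4))      # 0.6667
```

To tune thresholds such as those listed above, one would compute MAP on held-out folds rather than on the data used to pick the thresholds, which is the point of the cross-validation reminder.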
R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. ACM Press, New York, 1999.
Recall, Precision, P@n, AP, MAP, Top-K with a binary heap, etc.
Büttcher, S., Clarke, C., Cormack, G. V. Information Retrieval: Implementing and Evaluating Search Engines. The MIT Press, 2010.
Poisson model, IDF, RIDF, etc.
Manning, C. D., & Schütze, H. Foundations of Statistical Natural Language Processing. The MIT Press, 1999.
Dynamic stop word list based on IDF and RIDF
Amati, G., Carpineto, C., Romano, G. (Eds.). Advances in Information Retrieval, 29th European Conference on IR Research, ECIR 2007, Rome, Italy, April 2-5, 2007, Proceedings. Lecture Notes in Computer Science, Volume 4425, Springer, 2007.
Index term weighting based on IDF and RIDF
Kita, Tsuda, Shishibori. 情報検索アルゴリズム (Information Retrieval Algorithms). Kyoritsu Shuppan, 2002.