Slide 39
KARAKURI Inc. All rights reserved. 39
References
・Non-deep-learning methods
[8] An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics
(Victoria López et al., 2013)
Benchmarks data-level, algorithm-level, and hybrid approaches on many datasets. Ensemble methods performed best.
Also discusses factors other than class imbalance that degrade performance (noise, class overlap, etc.).
[9] Clustering-based undersampling in class-imbalanced data (Lin Wei-Chao et al., 2017)
[10] Confusion-Matrix-Based Kernel Logistic Regression for Imbalanced Data Classification (Miho Ohsaki et al., 2017)
[11] Experimental Perspectives on Learning from Imbalanced Data (Jason Van Hulse et al., 2007)
Overall, Random Under Sampling performed well in this paper, but its conclusion is that the best method depends on the learning algorithm and the evaluation metric.
[12] KRNN: k Rare-class Nearest Neighbour classification (Xiuzhen Zhang et al., 2017)
[13] Learning from Imbalanced Data in Presence of Noisy and Borderline Examples (Krystyna Napierała et al., 2010)
[14] Multiclass Imbalance Problems: Analysis and Potential Solutions (Shuo Wang and Xin Yao, 2012)
[15] SMOTE: Synthetic Minority Over-sampling Technique (N. V. Chawla et al., 2002)
[16] SMOTE-IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering
(José A. Sáez et al., 2015)
[17] Under-sampling class imbalanced datasets by combining clustering analysis and instance selection (Chih-Fong Tsai et al., 2019)
[18] Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric (Sabri Boughorbel et al., 2017)
[19] Index of Balanced Accuracy: A Performance Measure for Skewed Class Distributions (V. García et al., 2009)