Is evaluation based on accuracy of classification algorithms misleading? An approach to model validation using Bayes error rate

Document Type: Research Article

Authors

Faculty of Mathematics and Computer Science, Amirkabir University of Technology

Abstract

Researchers have long regarded model accuracy as the primary metric for evaluating
the performance of classification algorithms. The current evaluation approach, which relies solely
on model accuracy, often leads to inappropriate evaluation of classifiers, regardless of the dataset’s
separability and complexity. This limitation underscores the need for a new and more comprehensive
method. We argue that accuracy-based evaluation can be misleading, even when considering
measures of data separability and complexity. We compare the error rates of well-known classifiers
on Gaussian-generated datasets and show that, paradoxically, many algorithms’ observed errors are
lower than that of the theoretical optimal classifier, leading to an overestimation of their performance.
We consider a model invalid if its error rate is lower than the optimal classifier error, known as the
Bayes error rate. To identify such invalid models, we introduce a procedure and propose an algorithm
for model validation based on the Bayes error rate.
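
The validity criterion stated in the abstract can be illustrated with a minimal sketch. It is not the algorithm proposed in the article. It assumes the simplest Gaussian setting: two equiprobable one-dimensional classes with a shared standard deviation, for which the Bayes error rate has the closed form Φ(−Δ/2) with Δ = |μ₁ − μ₀|/σ. The function names and the `tolerance` parameter are hypothetical and introduced only for illustration.

```python
import math
import random


def bayes_error_equal_variance(mu0, mu1, sigma):
    """Bayes error for two equiprobable 1-D Gaussians N(mu0, sigma^2) and
    N(mu1, sigma^2). Assumption: equal priors and a shared variance."""
    delta = abs(mu1 - mu0) / sigma
    # Phi(-delta / 2) written with the complementary error function.
    return 0.5 * math.erfc(delta / (2.0 * math.sqrt(2.0)))


def is_invalid(observed_error, bayes_error, tolerance=0.0):
    """Flag a model whose observed error falls below the Bayes error rate.
    `tolerance` is a hypothetical slack for sampling noise (default: none)."""
    return observed_error < bayes_error - tolerance


def empirical_error_of_midpoint_rule(mu0, mu1, sigma, n, seed=0):
    """Error of the midpoint-threshold rule on n simulated test points.
    Assumes mu0 < mu1, so the rule predicts class 1 above the midpoint."""
    rng = random.Random(seed)
    threshold = (mu0 + mu1) / 2.0
    errors = 0
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(mu1 if label == 1 else mu0, sigma)
        predicted = 1 if x > threshold else 0
        errors += int(predicted != label)
    return errors / n


if __name__ == "__main__":
    optimal = bayes_error_equal_variance(0.0, 2.0, 1.0)
    observed = empirical_error_of_midpoint_rule(0.0, 2.0, 1.0, n=50)
    print(f"Bayes error: {optimal:.4f}")
    print(f"Observed error on 50 test points: {observed:.4f}")
    print(f"Flagged as invalid: {is_invalid(observed, optimal)}")
```

On a small test set, the observed error of even the optimal rule can fall below the Bayes error rate by chance. This is one way an accuracy figure can overstate performance, and it is why the sketch exposes a tolerance instead of fixing a strict comparison.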
