Estimation of misclassification probability for a distance-based classifier in high-dimensional data

Hiroki Watanabe; Masashi Hyodo; Yuki Yamada; Takashi Seo

doi:10.32917/hmj/1564106544

Hiroshima Math. J. 49(2): 175-193 (July 2019). DOI: 10.32917/hmj/1564106544

Abstract

We estimate the misclassification probability of a Euclidean distance-based classifier in high-dimensional data. We discuss two types of estimator: a newly proposed plug-in type estimator based on the normal approximation of the misclassification probability, and an estimator based on the well-known leave-one-out cross-validation method. Both estimators perform consistently when the dimension exceeds the total sample size, and the underlying distribution need not be multivariate normal. We also numerically evaluate the mean squared errors (MSEs) of these estimators in finite-sample, high-dimensional scenarios. In simulation, the newly proposed plug-in type estimator gives smaller MSEs than the estimator based on leave-one-out cross-validation.
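As a rough illustration of the leave-one-out idea mentioned above, the following is a minimal sketch, assuming a plain nearest-sample-mean rule under squared Euclidean distance. The abstract does not state the paper's exact discriminant rule or the form of the plug-in estimator, so neither is reproduced here; the function names (`centroid`, `sq_dist`, `classify`, `loocv_error_group1`) and the simulated data are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def classify(x, sample1, sample2):
    """Assign x to group 1 or 2 by the nearer sample mean.

    Assumption: a simple nearest-mean rule, with no correction for
    unequal sample sizes or covariances.
    """
    d1 = sq_dist(x, centroid(sample1))
    d2 = sq_dist(x, centroid(sample2))
    return 1 if d1 <= d2 else 2


def loocv_error_group1(sample1, sample2):
    """Leave-one-out estimate of the probability that a group-1
    observation is misclassified into group 2."""
    errors = 0
    for i, x in enumerate(sample1):
        rest = sample1[:i] + sample1[i + 1:]  # hold out observation i
        if classify(x, rest, sample2) != 1:
            errors += 1
    return errors / len(sample1)


# Simulated high-dimensional setting: dimension p exceeds n1 + n2.
rng = random.Random(0)
p, n1, n2 = 200, 10, 10
group1 = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n1)]
group2 = [[rng.gauss(0.5, 1.0) for _ in range(p)] for _ in range(n2)]
print(loocv_error_group1(group1, group2))
```

Each group-1 observation is held out in turn, the group-1 mean is recomputed from the remaining observations, and the held-out point is classified; the proportion of wrong assignments is the estimate. Swapping the roles of the two samples gives the corresponding estimate for group 2.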

Citation

Hiroki Watanabe, Masashi Hyodo, Yuki Yamada, Takashi Seo. "Estimation of misclassification probability for a distance-based classifier in high-dimensional data." Hiroshima Math. J. 49 (2) 175-193, July 2019. https://doi.org/10.32917/hmj/1564106544