Hiroshima Mathematical Journal

Estimation of misclassification probability for a distance-based classifier in high-dimensional data

Hiroki Watanabe, Masashi Hyodo, Yuki Yamada, and Takashi Seo

Full-text: Open access

Abstract

We estimate the misclassification probability of a Euclidean distance-based classifier in high-dimensional data. We discuss two types of estimator: a newly proposed plug-in type estimator based on the normal approximation of the misclassification probability, and an estimator based on the well-known leave-one-out cross-validation method. Both estimators perform consistently when the dimension exceeds the total sample size, and the underlying distribution need not be multivariate normal. We also numerically evaluate the mean squared errors (MSEs) of these estimators in finite-sample, high-dimensional scenarios. In simulation, the newly proposed plug-in type estimator gives smaller MSEs than the estimator based on leave-one-out cross-validation.
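To make the leave-one-out idea concrete, the following is a minimal sketch, not the authors' procedure. It assumes a plain nearest-centroid rule (assign an observation to the group whose sample mean is closer in squared Euclidean distance); the classifier studied in the paper may differ, for example by including correction terms. All function names (`sq_dist`, `mean_vector`, `classify`, `loocv_error`, `simulate`) and the simulation settings are hypothetical and chosen only for illustration. The plug-in estimator is not sketched because its formula is not given in the abstract.

```python
import random


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean_vector(rows):
    """Coordinate-wise sample mean of a list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def classify(x, mean1, mean2):
    """Return 1 if x is at least as close to mean1 as to mean2, else 2.

    Assumption: a plain nearest-centroid rule, which may differ from
    the classifier analysed in the paper.
    """
    return 1 if sq_dist(x, mean1) <= sq_dist(x, mean2) else 2


def loocv_error(X_own, X_other):
    """Leave-one-out estimate of the probability of misclassifying an
    observation from the group X_own into the group X_other."""
    n = len(X_own)
    if n < 2:
        raise ValueError("need at least two observations to leave one out")
    mean_other = mean_vector(X_other)
    totals = [sum(col) for col in zip(*X_own)]
    errors = 0
    for x in X_own:
        # Recompute the own-group mean without the held-out observation.
        mean_loo = [(t - xi) / (n - 1) for t, xi in zip(totals, x)]
        if classify(x, mean_loo, mean_other) != 1:
            errors += 1
    return errors / n


def simulate(n, p, shift, rng):
    """Draw n independent p-dimensional vectors with N(shift, 1) entries
    (an illustrative data model, not the paper's simulation design)."""
    return [[rng.gauss(shift, 1.0) for _ in range(p)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Dimension p = 200 exceeds the total sample size 20 + 20 = 40.
    X1 = simulate(20, 200, 0.0, rng)
    X2 = simulate(20, 200, 0.3, rng)
    print("LOOCV error, group 1:", loocv_error(X1, X2))
    print("LOOCV error, group 2:", loocv_error(X2, X1))
```

Each held-out observation is classified using a mean that excludes it, so the error count is not optimistically biased by reusing the observation for training. The estimate is a proportion with denominator equal to the group size, which gives it a coarse, discrete range in small samples.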

Article information

Source
Hiroshima Math. J., Volume 49, Number 2 (2019), 175-193.

Dates
Received: 24 August 2016
Revised: 23 January 2019
First available in Project Euclid: 26 July 2019

Permanent link to this document
https://projecteuclid.org/euclid.hmj/1564106544

Digital Object Identifier
doi:10.32917/hmj/1564106544

Mathematical Reviews number (MathSciNet)
MR3984991

Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]; 62H12: Estimation
Secondary: 62E20: Asymptotic distribution theory

Keywords
asymptotic approximations; expected probability of misclassification; linear discriminant function

Citation

Watanabe, Hiroki; Hyodo, Masashi; Yamada, Yuki; Seo, Takashi. Estimation of misclassification probability for a distance-based classifier in high-dimensional data. Hiroshima Math. J. 49 (2019), no. 2, 175-193. doi:10.32917/hmj/1564106544. https://projecteuclid.org/euclid.hmj/1564106544

