Open Access
Estimating the probabilities of misclassification using CV when the dimension and the sample sizes are large
Tomoyuki Nakagawa
Hiroshima Math. J. 48(3): 373-411 (November 2018). DOI: 10.32917/hmj/1544238034

Abstract

In this paper, we study the estimation of the probabilities of misclassification for high-dimensional data. Cross-validation (CV) is often used to estimate the probabilities of misclassification. When the sample sizes are large, CV provides a nearly unbiased estimate using only the original data. On the other hand, the properties of CV are not well known when the dimension is large compared with the sample sizes. Therefore, we investigate the asymptotic properties of CV when both the dimension and the sample sizes are large. Furthermore, we propose three CV-based methods for correcting the bias that are usable for high-dimensional data. We demonstrate the performance of the estimators in simulation studies.
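
As a rough illustration of how CV can estimate misclassification probabilities, the sketch below applies leave-one-out CV to a two-class linear discriminant rule with a pooled covariance matrix. The abstract does not state which discriminant rule or which form of CV the paper analyzes, so the rule, the leave-one-out scheme, and the names `lda_rule` and `loo_cv_misclassification` are all assumptions made for illustration. The bias corrections proposed in the paper are not implemented here.

```python
import numpy as np


def lda_rule(x1, x2):
    """Fit a two-class linear discriminant rule (assumed rule, not from the paper)."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    n1, n2 = len(x1), len(x2)
    # Pooled sample covariance; needs n1 + n2 - 2 >= p (and p >= 2 here) to be invertible.
    s = ((n1 - 1) * np.cov(x1, rowvar=False)
         + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    w = np.linalg.solve(s, m1 - m2)
    c = w @ (m1 + m2) / 2
    # A positive score assigns the observation to class 1, otherwise to class 2.
    return lambda x: x @ w - c


def loo_cv_misclassification(x1, x2):
    """Leave-one-out CV estimates of the two misclassification probabilities."""
    err1 = 0
    for i in range(len(x1)):
        # Refit without observation i of class 1, then classify that observation.
        rule = lda_rule(np.delete(x1, i, axis=0), x2)
        err1 += int(rule(x1[i]) <= 0)
    err2 = 0
    for j in range(len(x2)):
        # Refit without observation j of class 2, then classify that observation.
        rule = lda_rule(x1, np.delete(x2, j, axis=0))
        err2 += int(rule(x2[j]) > 0)
    return err1 / len(x1), err2 / len(x2)


# Hypothetical data: two Gaussian classes in p = 5 dimensions, 30 observations each.
rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, size=(30, 5))
x2 = rng.normal(1.0, 1.0, size=(30, 5))
e1, e2 = loo_cv_misclassification(x1, x2)
print(f"CV estimate, class 1 misclassified: {e1:.3f}")
print(f"CV estimate, class 2 misclassified: {e2:.3f}")
```

Each observation is classified by a rule fitted without it, which is why the estimate is close to unbiased when the sample sizes are large relative to the dimension. The regime studied in the paper is the one in which the dimension is comparable to the sample sizes.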

Citation

Tomoyuki Nakagawa. "Estimating the probabilities of misclassification using CV when the dimension and the sample sizes are large." Hiroshima Math. J. 48 (3) 373 - 411, November 2018. https://doi.org/10.32917/hmj/1544238034

Information

Received: 30 November 2017; Revised: 24 September 2018; Published: November 2018
First available in Project Euclid: 8 December 2018

zbMATH: 07032364
MathSciNet: MR3885268
Digital Object Identifier: 10.32917/hmj/1544238034

Subjects:
Primary: 62H30
Secondary: 62H12

Keywords: asymptotic expansion, classification, cross-validation, discriminant analysis, high-dimensional, probability of misclassification

Rights: Copyright © 2018 Hiroshima University, Mathematics Program
