Abstract
In this paper, we study the estimation of the probabilities of misclassification in high-dimensional data. Cross-validation (CV) is often used to estimate the probabilities of misclassification. CV provides a nearly unbiased estimate using only the original data when the sample sizes are large. However, the properties of CV are not well understood when the dimension is large compared with the sample sizes. We therefore investigate the asymptotic properties of CV when both the dimension and the sample sizes tend to be large. Furthermore, we propose three CV-based bias-correction methods that are usable for high-dimensional data. We evaluate the performance of the estimators in simulation studies.
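For readers unfamiliar with the CV estimator discussed above, the following is a minimal sketch of a leave-one-out CV estimate of the two misclassification probabilities. It assumes a two-group linear discriminant rule with a pooled sample covariance matrix, which the abstract does not specify; the function names, the simulated Gaussian data, and the chosen dimension and sample sizes are all hypothetical. It does not reproduce the bias corrections proposed in the paper.

```python
import numpy as np


def linear_discriminant(x, x1, x2):
    """Value of the linear discriminant function at x, trained on x1 and x2.

    Assumed rule (not taken from the paper): assign x to group 1 if the
    returned value is positive, otherwise to group 2.
    """
    n1, n2 = len(x1), len(x2)
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    # Pooled sample covariance; requires n1 + n2 - 2 >= p to be invertible.
    s = ((x1 - m1).T @ (x1 - m1) + (x2 - m2).T @ (x2 - m2)) / (n1 + n2 - 2)
    return float((m1 - m2) @ np.linalg.solve(s, x - (m1 + m2) / 2))


def loo_cv_error_rates(x1, x2):
    """Leave-one-out CV estimates of the two misclassification probabilities."""
    miss1 = 0
    for i in range(len(x1)):
        # Hold out the i-th observation of group 1 and classify it.
        w = linear_discriminant(x1[i], np.delete(x1, i, axis=0), x2)
        miss1 += int(w <= 0)
    miss2 = 0
    for j in range(len(x2)):
        # Hold out the j-th observation of group 2 and classify it.
        w = linear_discriminant(x2[j], x1, np.delete(x2, j, axis=0))
        miss2 += int(w > 0)
    return miss1 / len(x1), miss2 / len(x2)


# Hypothetical simulated data: two Gaussian groups with shifted means.
rng = np.random.default_rng(0)
p, n1, n2 = 5, 30, 30
x1 = rng.normal(size=(n1, p)) + 1.0
x2 = rng.normal(size=(n2, p))

e21, e12 = loo_cv_error_rates(x1, x2)
print(f"CV estimate, group 1 assigned to group 2: {e21:.3f}")
print(f"CV estimate, group 2 assigned to group 1: {e12:.3f}")
```

In this sketch the dimension is small relative to the sample sizes. Increasing `p` toward `n1 + n2 - 2` moves the example into the setting the paper targets, where the pooled covariance matrix becomes poorly conditioned and the behavior of the CV estimate is less well understood.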
Citation
Tomoyuki Nakagawa. "Estimating the probabilities of misclassification using CV when the dimension and the sample sizes are large." Hiroshima Math. J. 48 (3) 373 - 411, November 2018. https://doi.org/10.32917/hmj/1544238034
Information