Principal curves are parameterized curves passing “through the middle” of a data cloud. They generalize the notion of the first principal component in Principal Component Analysis. Several definitions of a principal curve have been proposed, one of which can be expressed as a least-squares minimization problem. In the present paper, adopting this definition, we study a Gaussian model selection method for choosing the length of the principal curve, in order to avoid interpolation, and obtain a related oracle-type inequality. The proposed method is implemented in practice and illustrated on cartography problems.
"Selecting the length of a principal curve within a Gaussian model." Electron. J. Statist. 7, 342-363, 2013. https://doi.org/10.1214/13-EJS775
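
To illustrate why the length of the curve has to be selected, the following minimal sketch evaluates an empirical least-squares criterion, namely the mean squared distance from the data points to a polygonal line, together with the length of that line. It assumes the curve is approximated by a polygonal line; the function names (`point_to_segment_sq_dist`, `empirical_criterion`, `curve_length`) and the toy data are hypothetical and are not taken from the paper. It does not implement the model selection procedure itself; it only shows that a sufficiently long curve can interpolate the data and drive the criterion to zero.

```python
import numpy as np


def point_to_segment_sq_dist(points, a, b):
    """Squared Euclidean distance from each row of `points` to segment [a, b]."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        # Degenerate segment: distance to the single point a.
        return np.sum((points - a) ** 2, axis=1)
    # Projection parameter, clipped so the projection stays on the segment.
    t = np.clip((points - a) @ ab / denom, 0.0, 1.0)
    proj = a + t[:, None] * ab
    return np.sum((points - proj) ** 2, axis=1)


def empirical_criterion(points, vertices):
    """Mean squared distance from the data to the polygonal line `vertices`."""
    per_segment = np.stack([
        point_to_segment_sq_dist(points, vertices[i], vertices[i + 1])
        for i in range(len(vertices) - 1)
    ])
    # Each point is matched to its closest segment.
    return float(per_segment.min(axis=0).mean())


def curve_length(vertices):
    """Total length of the polygonal line `vertices`."""
    return float(np.linalg.norm(np.diff(vertices, axis=0), axis=1).sum())


# Toy data (hypothetical): three points zigzagging around the x-axis.
points = np.array([[0.0, 1.0], [1.0, -1.0], [2.0, 1.0]])

# A short curve through the middle of the cloud.
short = np.array([[0.0, 0.0], [2.0, 0.0]])

# A longer curve that passes through every data point (interpolation).
interp = points.copy()

print(curve_length(short), empirical_criterion(points, short))
print(curve_length(interp), empirical_criterion(points, interp))
```

On this toy example the interpolating line is more than twice as long as the short one and has zero empirical criterion, which is the overfitting behaviour that a penalty on the length is meant to prevent.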