In this paper, we develop a general theory for the convergence rate of sieve estimates, maximum likelihood estimates (MLEs) and related estimates obtained by optimizing certain empirical criteria in general parameter spaces. In many cases, especially when the parameter space is infinite dimensional, maximization over the whole parameter space is undesirable. In such cases, one instead maximizes over an approximating space (a sieve) of the original parameter space and allows the size of the approximating space to grow as the sample size increases. This approach is called the method of sieves. In maximum likelihood estimation, an MLE based on a sieve is called a sieve MLE. We found that the convergence rate of a sieve estimate is governed by (a) the local expected values, variances and $L_2$ entropy of the criterion differences and (b) the approximation error of the sieve. A robust nonparametric regression problem, a mixture problem and a nonparametric regression problem are discussed as illustrations of the theory. We also found that when the underlying space is too large, the estimate obtained by optimizing over the whole parameter space may not achieve the best possible rate of convergence, whereas the sieve estimate typically does not suffer from this difficulty.
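To fix ideas, the method of sieves described above can be written schematically as follows. The notation ($\Theta$ for the parameter space, $\Theta_n$ for the sieve at sample size $n$, $\ell$ for the criterion, $Y_1, \dots, Y_n$ for the observations and $\hat\theta_n$ for the estimate) is assumed here for illustration only and is not taken from the paper, whose own definitions and conditions give the precise statement.

$$
\hat\theta_n \in \arg\max_{\theta \in \Theta_n} \frac{1}{n} \sum_{i=1}^{n} \ell(\theta, Y_i),
\qquad \Theta_n \subset \Theta, \quad \Theta_n \text{ growing with } n.
$$

In this sketch, taking $\ell(\theta, y) = \log p_\theta(y)$ for a density $p_\theta$ gives a sieve MLE, and taking $\Theta_n = \Theta$ for every $n$ corresponds to optimizing over the whole parameter space.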
"Convergence Rate of Sieve Estimates." Ann. Statist. 22 (2), 580-615, June 1994. https://doi.org/10.1214/aos/1176325486