Projection pursuit regression (PPR) and kernel regression are methods for estimating a smooth function of several variables from noisy data obtained at scattered sites. Methods based on local averaging can perform poorly in high dimensions (the curse of dimensionality). Intuition and examples have suggested that projection-based approaches can provide better fits. For what sorts of regression functions is this true? When and by how much do projection methods reduce the curse of dimensionality? We make a start by focusing on the two-dimensional problem and study the $L^2$ approximation error (bias) of the two procedures with respect to Gaussian measure. Let RA stand for a certain PPR-type approximation and KA for a particular kernel-type approximation. Building on a simple but striking duality for polynomials, we show that RA behaves significantly better than the minimax rate of approximation for radial functions, while KA performs significantly better than the minimax rate for harmonic functions. In fact, the rate improvements carry over to large classes: RA behaves very well for functions with enough angular smoothness (oscillating slowly with angle), while KA behaves very well for functions with enough Laplacian smoothness (oscillations averaging out locally). The rate improvements matter: they are equivalent to lowering the dimensionality of the problem. For example, for functions with nice tail behavior, RA behaves as if the dimensionality of the problem were 1.5 rather than its nominal value 2. Also, RA and KA are complementary: for a given function, if one method offers a dimensionality reduction, the other does not.
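The radial/harmonic duality can be made concrete with two quadratics on $\mathbb{R}^2$. The sketch below is an illustrative toy, not the paper's RA and KA operators: it checks numerically that local (kernel-style) averaging is nearly unbiased for a harmonic polynomial (the mean-value property) but biased for a radial one, while the radial quadratic is captured exactly by just two ridge (projection-type) terms. The bandwidth `h` and the specific test functions are choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two quadratics on R^2: one radial, one harmonic.
radial = lambda x, y: x**2 + y**2      # depends only on r = sqrt(x^2 + y^2)
harmonic = lambda x, y: x**2 - y**2    # satisfies Laplace's equation

# (1) Kernel-style local averaging at the origin: average f over a small
# Gaussian neighborhood (bandwidth h). For the harmonic function the
# oscillations average out locally, so the bias is ~0; for the radial
# function the average picks up a bias of about 2*h^2.
h = 0.3
pts = rng.normal(scale=h, size=(100_000, 2))
ka_radial = radial(pts[:, 0], pts[:, 1]).mean()      # approx. 2*h^2, not 0
ka_harmonic = harmonic(pts[:, 0], pts[:, 1]).mean()  # approx. 0

# (2) Ridge (projection-type) decomposition: the radial quadratic is an
# exact sum of two ridge functions, x^2 + y^2 = g(a1.x) + g(a2.x), g(t) = t^2.
a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x = rng.normal(size=(5, 2))
ridge_sum = (x @ a1) ** 2 + (x @ a2) ** 2
exact = radial(x[:, 0], x[:, 1])

print(np.allclose(ridge_sum, exact))      # two ridge terms reproduce it exactly
print(abs(ka_harmonic) < abs(ka_radial))  # kernel bias is far smaller on the harmonic one
```

Note the complementarity the abstract describes: neither toy method does well on the other's favorable case — a few ridge terms cannot reproduce a function oscillating rapidly with angle, and local averaging cannot avoid the curvature bias of a radial function.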
Donoho, D. L. and Johnstone, I. M. (1989). "Projection-Based Approximation and a Duality with Kernel Methods." Ann. Statist. 17(1), 58–106. https://doi.org/10.1214/aos/1176347004