### The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators

Stephen Portnoy and Roger Koenker
Source: Statist. Sci. Volume 12, Number 4 (1997), 279-300.

#### Abstract

Since the time of Gauss, it has been generally accepted that $\ell_2$-methods of combining observations by minimizing sums of squared errors have significant computational advantages over earlier $\ell_1$-methods based on minimization of absolute errors advocated by Boscovich, Laplace and others. However, $\ell_1$-methods are known to have significant robustness advantages over $\ell_2$-methods in many applications, and related quantile regression methods provide a useful, complementary approach to classical least-squares estimation of statistical models. Combining recent advances in interior point methods for solving linear programs with a new statistical preprocessing approach for $\ell_1$-type problems, we obtain a 10- to 100-fold improvement in computational speeds over current (simplex-based) $\ell_1$-algorithms in large problems, demonstrating that $\ell_1$-methods can be made competitive with $\ell_2$-methods in terms of computational speed throughout the entire range of problem sizes. Formal complexity results suggest that $\ell_1$-regression can be made faster than least-squares regression for n sufficiently large and p modest.
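The robustness claim in the abstract can be seen in the simplest setting, an intercept-only model, where the $\ell_2$ fit is the sample mean and the $\ell_1$ fit is a sample median. The sketch below is only an illustration under stated assumptions: the data values and the helper names `sum_abs` and `l1_fit` are invented here and do not come from the paper, and the brute-force scan is not the interior point algorithm the authors describe.

```python
from statistics import mean, median

def sum_abs(b, ys):
    # l1 objective for an intercept-only model: sum of absolute errors.
    return sum(abs(y - b) for y in ys)

def l1_fit(ys):
    # Hypothetical helper: in the intercept-only case an l1 minimizer is
    # attained at one of the observations, so scanning the data suffices.
    return min(ys, key=lambda b: sum_abs(b, ys))

# Illustrative data (assumed): the second sample replaces 5.0 by an outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
dirty = [1.0, 2.0, 3.0, 4.0, 500.0]

# The l2 fit is the mean; it moves a long way when one point is corrupted.
l2_clean, l2_dirty = mean(clean), mean(dirty)

# The l1 fit coincides with the median and does not move at all here.
l1_clean, l1_dirty = l1_fit(clean), l1_fit(dirty)

print(l2_clean, l2_dirty)  # 3.0 102.0
print(l1_clean, l1_dirty)  # 3.0 3.0
```

With covariates the $\ell_1$ problem no longer has a closed form and is usually posed as a linear program, which is where the simplex and interior point methods discussed in the abstract enter.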

Permanent link to this document: http://projecteuclid.org/euclid.ss/1030037960
Mathematical Reviews number (MathSciNet): MR1619189
Digital Object Identifier: doi:10.1214/ss/1030037960
Zentralblatt MATH identifier: 0955.62608

### References

ney, A., Ostrouchov, S. and Sorenson, D. (1995). LAPACK Users' Guide. SIAM, Philadelphia.
Barrodale, I. and Roberts, F. D. K. (1974). Solution of an overdetermined system of equations in the $\ell_1$ norm. Communications of the ACM 17 319-320.
Bartels, R. and Conn, A. (1980). Linearly constrained discrete $\ell_1$ problems. ACM Trans. Math. Software 6 594-608.
Mathematical Reviews (MathSciNet): MR82a:90165
Digital Object Identifier: doi:10.1145/355921.355930
Bloomfield, P. and Steiger, W. L. (1983). Least Absolute Deviations: Theory, Applications, and Algorithms. Birkhäuser, Boston.
Buchinsky, M. (1994). Changes in US wage structure 1963-87: an application of quantile regression. Econometrica 62 405-458.
Buchinsky, M. (1995). Quantile regression, the Box-Cox transformation model and U.S. wage structure 1963-1987. J. Econometrics 65 109-154.
Chamberlain, G. (1994). Quantile regression, censoring and the structure of wages. In Advances in Econometrics (C. Sims, ed.). North-Holland, Amsterdam.
Mathematical Reviews (MathSciNet): MR1278270
Chambers, J. M. (1992). Linear models. In Statistical Models in S (J. M. Chambers and T. J. Hastie, eds.) 95-144. Wadsworth, Pacific Grove, CA.
Charnes, A., Cooper, W. W. and Ferguson, R. O. (1955). Optimal estimation of executive compensation by linear programming. Management Science 1 138-151.
Mathematical Reviews (MathSciNet): MR17,507a
Digital Object Identifier: doi:10.1287/mnsc.1.2.138
Chaudhuri, P. (1992). Generalized regression quantiles. In Proceedings of the Second Conference on Data Analysis Based on the L1 Norm and Related Methods 169-186. North-Holland, Amsterdam.
Mathematical Reviews (MathSciNet): MR1214831
Chen, S. and Donoho, D. L. (1995). Atomic decomposition by basis pursuit. SIAM J. Sci. Stat. Comp. To appear.
Dikin, I. I. (1967). Iterative solution of problems of linear and quadratic programming. Soviet Math. Dokl. 8 674-675.
Mathematical Reviews (MathSciNet): MR36:4902
Zentralblatt MATH: 0189.19504
Edgeworth, F. Y. (1887). On observations relating to several quantities. Hermathena 6 279-285.
Edgeworth, F. Y. (1888). On a new method of reducing observations relating to several quantities. Philosophical Magazine 25 184-191.
Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR1383587
Zentralblatt MATH: 0873.62037
Fiacco, A. V. and McCormick, G. P. (1968). Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Wiley, New York.
Mathematical Reviews (MathSciNet): MR39:5152
Floyd, R. W. and Rivest, R. L. (1975). Expected time bounds for selection. Communications of the ACM 18 165-173.
Frisch, R. (1956). La Résolution des problèmes de programme linéaire par la méthode du potential logarithmique. Cahiers du Séminaire d'Econometrie 4 7-20.
Gauss, C. F. (1821). Theoria combinationis observationum erroribus minimis obnoxiae: pars prior. [Translated (1995) by G. W. Stewart as Theory of the Combination of Observations Least Subject to Error. SIAM, Philadelphia.]
Mathematical Reviews (MathSciNet): MR1329543
Gill, P., Murray, W., Saunders, M., Tomlin, T. and Wright, M. (1986). On projected Newton barrier methods for linear programming and an equivalence to Karmarkar's projective method. Math. Programming 36 183-209.
Mathematical Reviews (MathSciNet): MR866988
Zentralblatt MATH: 0624.90062
Digital Object Identifier: doi:10.1007/BF02592025
Gonzaga, C. C. (1992). Path-following methods for linear programming. SIAM Rev. 34 167-224.
Mathematical Reviews (MathSciNet): MR93j:90050
Zentralblatt MATH: 0763.90063
Digital Object Identifier: doi:10.1137/1034048
Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR1270012
Zentralblatt MATH: 0832.62032
Gutenbrunner, C. and Jurečková, J. (1992). Regression quantile and regression rank score process in the linear model and derived statistics. Ann. Statist. 20 305-330.
Gutenbrunner, C., Jurečková, J., Koenker, R. and Portnoy, S. (1993). Tests of linear hypotheses based on regression rank scores. J. Nonparametric Statist. 2 307-333.
Mathematical Reviews (MathSciNet): MR1256383
Digital Object Identifier: doi:10.1080/10485259308832561
Hall, P. and Sheather, S. (1988). On the distribution of a studentized quantile. J. Roy. Statist. Soc. Ser. B 50 381-391.
Zentralblatt MATH: 0674.62034
Mathematical Reviews (MathSciNet): MR970974
Karmarkar, N. (1984). A new polynomial time algorithm for linear programming. Combinatorica 4 373-395.
Mathematical Reviews (MathSciNet): MR86i:90072
Zentralblatt MATH: 0557.90065
Digital Object Identifier: doi:10.1007/BF02579150
Koenker, R. (1994). Confidence intervals for regression quantiles. In Asymptotic Statistics, Proceedings of the Fifth Prague Symposium (P. Mandl and M. Hušková, eds.) 349-359. Springer, Heidelberg.
Mathematical Reviews (MathSciNet): MR1311953
Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica 46 33-50.
Mathematical Reviews (MathSciNet): MR57:14279
Zentralblatt MATH: 0373.62038
Digital Object Identifier: doi:10.2307/1913643
Koenker, R. and d'Orey, V. (1987). Computing regression quantiles. J. Roy. Statist. Soc. Ser. C 36 383-393.
Koenker, R. and d'Orey, V. (1993). Computing dual regression quantiles and regression rank scores. J. Roy. Statist. Soc. Ser. C 43 410-414.
Koenker, R., Ng, P. and Portnoy, S. (1994). Quantile smoothing splines. Biometrika 81 673-680.
Mathematical Reviews (MathSciNet): MR1326417
Zentralblatt MATH: 0810.62040
Digital Object Identifier: doi:10.1093/biomet/81.4.673
Laplace, P.-S. (1789). Sur quelques points du système du monde. Mémoires de l'Académie des Sciences de Paris. (Reprinted in Œuvres Complètes 11 475-558. Gauthier-Villars, Paris.)
Lustig, I. J., Marsten, R. E. and Shanno, D. F. (1992). On implementing Mehrotra's predictor-corrector interior-point method for linear programming. SIAM J. Optim. 2 435-449.
Mathematical Reviews (MathSciNet): MR93h:90059
Zentralblatt MATH: 0771.90066
Digital Object Identifier: doi:10.1137/0802022
Lustig, I. J., Marsten, R. E. and Shanno, D. F. (1994). Interior point methods for linear programming: computational state of the art (with discussion). ORSA J. Comput. 6 1-36.
Mathematical Reviews (MathSciNet): MR1261376
Zentralblatt MATH: 0798.90100
Digital Object Identifier: doi:10.1287/ijoc.6.1.1
Manning, W., Blumberg, L. and Moulton, L. H. (1995). The demand for alcohol: the differential response to price. J. Health Economics 14 123-148.
Mehrotra, S. (1992). On the implementation of a primal-dual interior point method. SIAM J. Optim. 2 575-601.
Mathematical Reviews (MathSciNet): MR93g:90047
Zentralblatt MATH: 0773.90047
Digital Object Identifier: doi:10.1137/0802028
Meketon, M. S. (1986). Least absolute value regression. Technical report, Bell Labs, Holmdel, NJ.
Mizuno, S., Todd, M. J. and Ye, Y. (1993). On adaptive-step primal dual interior point algorithms for linear programming. Math. Oper. Res. 18 964-981.
Mathematical Reviews (MathSciNet): MR94i:90072
Zentralblatt MATH: 0810.90091
Digital Object Identifier: doi:10.1287/moor.18.4.964
Oja, H. (1983). Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 327-332.
Mathematical Reviews (MathSciNet): MR85a:62091
Zentralblatt MATH: 0517.62051
Portnoy, S. (1991). Asymptotic behavior of the number of regression quantile breakpoints. SIAM Journal of Scientific and Statistical Computing 12 867-883.
Mathematical Reviews (MathSciNet): MR92f:62088
Digital Object Identifier: doi:10.1137/0912047
Powell, J. L. (1986). Censored regression quantiles. J. Econometrics 32 143-155.
Mathematical Reviews (MathSciNet): MR88e:62246
Digital Object Identifier: doi:10.1016/0304-4076(86)90016-3
Renegar, J. (1988). A polynomial-time algorithm based on Newton's method for linear programming. Math. Programming 40 59-93.
Mathematical Reviews (MathSciNet): MR89b:90130
Zentralblatt MATH: 0654.90050
Digital Object Identifier: doi:10.1007/BF01580724
Shamir, R. (1993). Probabilistic analysis in linear programming. Statist. Sci. 8 57-64.
Mathematical Reviews (MathSciNet): MR1194444
Zentralblatt MATH: 0768.90054
Siddiqui, M. (1960). Distribution of quantiles in samples from a bivariate population. J. Res. Nat. Bur. Stand. B 64 145-150.
Mathematical Reviews (MathSciNet): MR25:4591
Sonnevend, G., Stoer, J. and Zhao, G. (1991). On the complexity of following the central path of linear programs by linear extrapolation II. Math. Programming 52 527-553.
Mathematical Reviews (MathSciNet): MR93b:90044
Zentralblatt MATH: 0742.90056
Digital Object Identifier: doi:10.1007/BF01582904
Stigler, S. M. (1984). Boscovich, Simpson and a 1760 manuscript note on fitting a linear relation. Biometrika 71 615-620.
Mathematical Reviews (MathSciNet): MR775409
Digital Object Identifier: doi:10.1093/biomet/71.3.615
Stigler, S. M. (1986). The History of Statistics: Measurement of Uncertainty before 1900. Harvard Univ. Press.
Mathematical Reviews (MathSciNet): MR852410
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288.
Mathematical Reviews (MathSciNet): MR96j:62134
Vanderbei, R. J., Meketon, M. J. and Freedman, B. A. (1986). A modification of Karmarkar's linear programming algorithm. Algorithmica 1 395-407.
Mathematical Reviews (MathSciNet): MR88e:90052
Zentralblatt MATH: 0626.90056
Digital Object Identifier: doi:10.1007/BF01840454
Wagner, H. M. (1959). Linear programming techniques for regression analysis. J. Amer. Statist. Assoc. 54 206-212.
Zentralblatt MATH: 0088.35702
Mathematical Reviews (MathSciNet): MR130753
Digital Object Identifier: doi:10.2307/2282146
Welsh, A. H. (1996). Robust estimation of smooth regression and spread functions and their derivatives. Statist. Sinica 6 347-366.
Mathematical Reviews (MathSciNet): MR97f:62076
Zentralblatt MATH: 0884.62047
Wright, M. H. (1992). Interior methods for constrained optimization. Acta Numerica 1 341-407.
Mathematical Reviews (MathSciNet): MR1165729
Zentralblatt MATH: 0766.65053
Zhang, Y. (1992). Primal-dual interior point approach for computing $\ell_1$-solutions and $\ell_\infty$-solutions of overdetermined linear systems. J. Optim. Theory Appl. 77 323-341.
Mathematical Reviews (MathSciNet): MR1221930
Zentralblatt MATH: 0796.49029
Digital Object Identifier: doi:10.1007/BF00940715
GAUSSIAN HARE, LAPLACIAN TORTOISE 297
borne, 1985). It can be computed by the fast median algorithm of Bloomfield and Steiger, for example. The Barrodale-Roberts approach is equivalent to using a comparison sort in this context and seems already sufficient to explain the $O(n^2)$ behavior observed. Recently, Osborne and Watson (1996) have observed that the secant algorithm can be applied here and interpreted as an alternative to the usual median-of-three partitioning in the fast median computation. The improvement over Bloomfield and Steiger can be staggering in problems which arise in fitting a deterministic model in the presence of noise. For the record, the code distributed by Bartels, Conn and Sinclair used a heap sort in the linesearch implementation and was perhaps the first to improve on the $O(n^2)$ asymptotics. It would seem to be time that S-PLUS used a more modern implementation.

3. There is at least some folk law concerning the inferior performance of interior point methods when compared with simplex-style methods in postoptimality computations. However, this is the type of computation employed when stud
Güler, O., den Hertog, D., Roos, C. and Terlaky, T. (1993). Degeneracy in interior point methods for linear programming: a survey. Ann. Oper. Res. 46 107-138.
Mathematical Reviews (MathSciNet): MR94j:90021
Zentralblatt MATH: 0785.90067
Digital Object Identifier: doi:10.1007/BF02096259
Kennedy, W. and Gentle, J. E., Jr. (1980). Statistical Computing. Dekker, New York.
Mathematical Reviews (MathSciNet): MR81g:62005
Zentralblatt MATH: 0435.62003
Monteiro, R. D. C. and Mehrotra, S. (1996). A general parametric analysis approach and its implications to sensitivity analysis in interior point methods. Math. Programming 72 65-82.
Mathematical Reviews (MathSciNet): MR97h:90072
Zentralblatt MATH: 0853.90083
Osborne, M. R. (1985). Finite Algorithms in Optimization and Data Analysis. Wiley, New York.
Mathematical Reviews (MathSciNet): MR88f:65005
Zentralblatt MATH: 0573.65044
Osborne, M. R. and Watson, G. A. (1996). Aspects of M-estimation and $\ell_1$ fitting. In Numerical Analysis (D. F. Griffith and G. A. Watson, eds.). World Scientific, Singapore.
Mathematical Reviews (MathSciNet): MR1444626
Press, W., Flannery, B., Teukolsky, S. and Vetterling, W. (1986). Numerical Recipes: The Art of Scientific Computing. Cambridge Univ. Press.
Mathematical Reviews (MathSciNet): MR833288
Thisted, R. A. (1988). Elements of Statistical Computing. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR89h:65005
Zentralblatt MATH: 0663.62001