The Annals of Applied Statistics

Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique

Winston Lin


Freedman [Adv. in Appl. Math. 40 (2008) 180–193; Ann. Appl. Stat. 2 (2008) 176–196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments, using Neyman’s model for randomization inference. Contrary to conventional wisdom, he argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This paper shows that in sufficiently large samples, those problems are either minor or easily fixed. OLS adjustment cannot hurt asymptotic precision when a full set of treatment–covariate interactions is included. Asymptotically valid confidence intervals can be constructed with the Huber–White sandwich standard error estimator. Checks on the asymptotic approximations are illustrated with data from Angrist, Lang, and Oreopoulos’s [Am. Econ. J.: Appl. Econ. 1:1 (2009) 136–163] evaluation of strategies to improve college students’ achievement. The strongest reasons to support Freedman’s preference for unadjusted estimates are transparency and the dangers of specification search.
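
The adjustment described in the abstract can be illustrated with a minimal sketch. It is not code from the paper: the function name `interacted_ols`, the simulated data, and the choice of the HC0 variant of the Huber–White sandwich estimator are all assumptions made for illustration. The sketch regresses the outcome on a treatment indicator, mean-centered covariates, and the full set of treatment–covariate interactions, then reads the estimated average treatment effect off the treatment coefficient.

```python
import numpy as np

def interacted_ols(y, t, x):
    """OLS of y on treatment, centered covariates, and their interactions.

    Returns (ate_hat, se_hat): the coefficient on t and its HC0
    (Huber-White) sandwich standard error. Hypothetical helper, not
    taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    xc = x - x.mean(axis=0)  # center covariates at the full-sample mean
    design = np.column_stack([np.ones_like(y), t, xc, t[:, None] * xc])
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    # Sandwich: (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
    meat = design.T @ (design * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return beta[1], np.sqrt(cov[1, 1])

# Simulated experiment (all values are illustrative assumptions):
# unequal arm sizes and treatment effects that vary with the covariate.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
t = np.zeros(n)
t[rng.choice(n, size=n // 4, replace=False)] = 1.0  # complete randomization
y0 = x + rng.normal(size=n)          # potential outcome under control
y1 = y0 + 1.0 + 2.0 * x              # potential outcome under treatment
y = np.where(t == 1, y1, y0)         # observed outcome

unadjusted = y[t == 1].mean() - y[t == 0].mean()
adjusted, se = interacted_ols(y, t, x)
print("unadjusted difference in means:", unadjusted)
print("interacted OLS estimate:", adjusted, "sandwich SE:", se)
```

Because the covariates are centered at the full-sample mean, the treatment coefficient equals the difference between the two arm-specific regression lines evaluated at the mean covariate value. Small-sample corrections to the sandwich estimator (for example HC2 or HC3) could be substituted for HC0.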

Article information

Ann. Appl. Stat. Volume 7, Number 1 (2013), 295-318.

First available in Project Euclid: 9 April 2013

Analysis of covariance; covariate adjustment; randomization inference; sandwich estimator; robust standard errors; social experiments; program evaluation


Lin, Winston. Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique. Ann. Appl. Stat. 7 (2013), no. 1, 295–318. doi:10.1214/12-AOAS583.


  • Angrist, J. D. and Imbens, G. W. (2002). Comment on “Covariance adjustment in randomized experiments and observational studies” by P. R. Rosenbaum. Statist. Sci. 17 304–307.
  • Angrist, J. D., Lang, D. and Oreopoulos, P. (2009). Incentives and services for college achievement: Evidence from a randomized trial. Am. Econ. J.: Appl. Econ. 1 136–163.
  • Angrist, J. D. and Pischke, J. S. (2009). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton Univ. Press, Princeton.
  • Ashenfelter, O. and Plant, M. W. (1990). Nonparametric estimates of the labor-supply effects of negative income tax programs. J. Labor Econ. 8 S396–S415.
  • Berk, R., Barnes, G., Ahlman, L. and Kurtz, E. (2010). When second best is good enough: A comparison between a true experiment and a regression discontinuity quasi-experiment. J. Exp. Criminol. 6 191–208.
  • Chamberlain, G. (1982). Multivariate regression models for panel data. J. Econometrics 18 5–46.
  • Chung, E. Y. and Romano, J. P. (2011a). Exact and asymptotically robust permutation tests. Technical Report 2011-05, Dept. Statistics, Stanford Univ.
  • Chung, E. Y. and Romano, J. P. (2011b). Asymptotically valid and exact permutation tests based on two-sample $U$-statistics. Technical Report 2011-09, Dept. Statistics, Stanford Univ.
  • Cochran, W. G. (1942). Sampling theory when the sampling-units are of unequal sizes. J. Amer. Statist. Assoc. 37 199–212.
  • Cochran, W. G. (1957). Analysis of covariance: Its nature and uses. Biometrics 13 261–281.
  • Cochran, W. G. (1969). The use of covariance in observational studies. J. R. Stat. Soc. Ser. C. Appl. Stat. 18 270–275.
  • Cochran, W. G. (1977). Sampling Techniques, 3rd ed. Wiley, New York.
  • Cox, D. R. and McCullagh, P. (1982). Some aspects of analysis of covariance. Biometrics 38 541–561.
  • Cox, D. R. and Reid, N. (2000). The Theory of the Design of Experiments. CRC Press, Boca Raton, FL.
  • Davidson, R. and MacKinnon, J. G. (1993). Estimation and Inference in Econometrics. Oxford Univ. Press, New York.
  • Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics 1. Cambridge Univ. Press, Cambridge.
  • Deaton, A. (2010). Instruments, randomization, and learning about development. J. Econ. Lit. 48 424–455.
  • Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. I 59–82. Univ. California Press, Berkeley, CA.
  • Fienberg, S. E. and Tanur, J. M. (1987). Experimental and sampling structures: Parallels diverging and meeting. Internat. Statist. Rev. 55 75–96.
  • Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. Oliver and Boyd, Edinburgh.
  • Fisher, R. A. (1935). The Design of Experiments. Oliver and Boyd, Edinburgh.
  • Freedman, D. A. (1991). Statistical models and shoe leather (with discussion). Socio. Meth. 21 291–358.
  • Freedman, D. A. (2006). On the so-called “Huber sandwich estimator” and “robust standard errors”. Amer. Statist. 60 299–302.
  • Freedman, D. A. (2008a). On regression adjustments to experimental data. Adv. in Appl. Math. 40 180–193.
  • Freedman, D. A. (2008b). On regression adjustments in experiments with several treatments. Ann. Appl. Stat. 2 176–196.
  • Freedman, D. A. (2008c). Editorial: Oasis or mirage? Chance 21(1) 59–61. Annotated references at
  • Freedman, D. A. (2010). Survival analysis: An epidemiological hazard? In Statistical Models and Causal Inference: A Dialogue with the Social Sciences (D. Collier, J. S. Sekhon and P. B. Stark, eds.) 169–192. Cambridge Univ. Press, Cambridge.
  • Freedman, D. A., Pisani, R. and Purves, R. (2007). Statistics, 4th ed. Norton, New York.
  • Fuller, W. A. (1975). Regression analysis for sample survey. Sankhyā Ser. C 37 117–132.
  • Fuller, W. A. (2002). Regression estimation for survey samples. Surv. Meth. 28 5–23.
  • Fuller, W. A. (2009). Sampling Statistics. Wiley, Hoboken, NJ.
  • Gail, M. H., Mark, S. D., Carroll, R. J., Green, S. B. and Pee, D. (1996). On design considerations and randomization-based inference for community intervention trials. Stat. Med. 15 1069–1092.
  • Green, D. P. and Aronow, P. M. (2011). Analyzing experimental data using regression: When is bias a practical concern? Working paper, Yale Univ.
  • Greenberg, D. and Shroder, M. (2004). The Digest of Social Experiments, 3rd ed. Urban Institute Press, Washington, DC.
  • Hansen, B. B. and Bowers, J. (2009). Attributing effects to a cluster-randomized get-out-the-vote campaign. J. Amer. Statist. Assoc. 104 873–885.
  • Hinkley, D. V. (1977). Jackknifing in unbalanced situations. Technometrics 19 285–292.
  • Hinkley, D. V. and Wang, S. (1991). Efficiency of robust standard errors for regression coefficients. Comm. Statist. Theory Methods 20 1–11.
  • Holland, P. W. (1986). Statistics and causal inference. J. Amer. Statist. Assoc. 81 945–970.
  • Holt, D. and Smith, T. M. F. (1979). Post stratification. J. Roy. Statist. Soc. Ser. A 142 33–46.
  • Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. I: Statistics 221–233. Univ. California Press, Berkeley, CA.
  • Imbens, G. W. (2010). Better LATE than nothing: Some comments on Deaton (2009) and Heckman and Urzua (2009). J. Econ. Lit. 48 399–423.
  • Imbens, G. W. and Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. J. Econ. Lit. 47 5–86.
  • Klar, N. and Darlington, G. (2004). Methods for modelling change in cluster randomization trials. Stat. Med. 23 2341–2357.
  • Kline, P. (2011). Oaxaca–Blinder as a reweighting estimator. Am. Econ. Rev. 101(3) 532–537.
  • Kline, P. and Santos, A. (2012). Higher order properties of the wild bootstrap under misspecification. J. Econometrics 171 54–70.
  • Lin, W. (2013). Supplement to “Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique.” DOI:10.1214/12-AOAS583SUPP.
  • Lin, W., Robins, P. K., Card, D., Harknett, K. and Lui-Gurr, S. (1998). When Financial Incentives Encourage Work: Complete 18-Month Findings from the Self-Sufficiency Project. Social Research and Demonstration Corp., Ottawa.
  • Lumley, T. (2010). Complex Surveys: A Guide to Analysis Using R. Wiley, Hoboken, NJ.
  • MacKinnon, J. G. (2013). Thirty years of heteroskedasticity-robust inference. In Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis: Essays in Honor of Halbert L. White Jr. (X. Chen and N. R. Swanson, eds.) 437–461. Springer, New York.
  • MacKinnon, J. G. and White, H. (1985). Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. J. Econometrics 29 305–325.
  • Meyer, B. D. (1995). Lessons from the U.S. unemployment insurance experiments. J. Econ. Lit. 33 91–131.
  • Middleton, J. A. and Aronow, P. M. (2012). Unbiased estimation of the average treatment effect in cluster-randomized experiments. Working Paper, Yale Univ.
  • Miller, R. G. Jr. (1986). Beyond ANOVA, Basics of Applied Statistics. Wiley, New York.
  • Miratrix, L. W., Sekhon, J. S. and Yu, B. (2012). Adjusting treatment effect estimates by post-stratification in randomized experiments. J. R. Stat. Soc. Ser. B. Stat. Methodol. 75 369–396.
  • Moher, D., Hopewell, S. and Schulz, K. F. et al. (2010). CONSORT 2010 explanation and elaboration: Updated guidelines for reporting parallel group randomised trials. BMJ 340 c869.
  • Neyman, J. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. Ann. Agric. Sci. 101–151 (in Polish). [Reprinted in English with discussion by T. Speed and D. B. Rubin in Statist. Sci. 5 (1990) 463–480. MR1092986]
  • Raudenbush, S. W. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychol. Meth. 2 173–185.
  • Reichardt, C. S. and Gollob, H. F. (1999). Justifying the use and increasing the power of a $t$ test for a randomized experiment with a convenience sample. Psychol. Meth. 4 117–128.
  • Rosenbaum, P. R. (2002). Covariance adjustment in randomized experiments and observational studies. Statist. Sci. 17 286–327.
  • Rosenbaum, P. R. (2010). Design of Observational Studies. Springer, New York.
  • Royall, R. M. and Cumberland, W. G. (1978). Variance estimation in finite population sampling. J. Amer. Statist. Assoc. 73 351–358.
  • Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66 688–701.
  • Rubin, D. B. (1984). William G. Cochran’s contributions to the design, analysis, and evaluation of observational studies. In W. G. Cochran’s Impact on Statistics (P. S. R. S. Rao and J. Sedransk, eds.) 37–69. Wiley, New York.
  • Rubin, D. B. (2005). Causal inference using potential outcomes: Design, modeling, decisions. J. Amer. Statist. Assoc. 100 322–331.
  • Rubin, D. B. and van der Laan, M. J. (2011). Targeted ANCOVA estimator in RCTs. In Targeted Learning: Causal Inference for Observational and Experimental Data (M. J. van der Laan and S. Rose, eds.) 201–215. Springer, New York.
  • Samii, C. and Aronow, P. M. (2012). On equivalencies between design-based and regression-based variance estimators for randomized experiments. Statist. Probab. Lett. 82 365–370.
  • Schochet, P. Z. (2010). Is regression adjustment supported by the Neyman model for causal inference? J. Statist. Plann. Inference 140 246–259.
  • Senn, S. J. (1989). Covariate imbalance and random allocation in clinical trials. Stat. Med. 8 467–475.
  • Stock, J. H. (2010). The other transformation in econometric practice: Robust tools for inference. J. Econ. Perspect. 24(2) 83–94.
  • Stonehouse, J. M. and Forrester, G. J. (1998). Robustness of the $t$ and $U$ tests under combined assumption violations. J. Appl. Stat. 25 63–74.
  • Tibshirani, R. (1986). Discussion of “Jackknife, bootstrap and other resampling methods in regression analysis” by C. F. J. Wu. Ann. Statist. 14 1335–1339. [Correction: (1988) 16 479.]
  • Tsiatis, A. A., Davidian, M., Zhang, M. and Lu, X. (2008). Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Stat. Med. 27 4658–4677.
  • Tukey, J. W. (1991). Use of many covariates in clinical trials. Internat. Statist. Rev. 59 123–137.
  • Tukey, J. W. (1993). Tightening the clinical trial. Contr. Clin. Trials 14 266–285.
  • Watson, D. J. (1937). The estimation of leaf area in field crops. J. Agr. Sci. 27 474–483.
  • Welch, B. L. (1949). Further note on Mrs. Aspin’s tables and on certain approximations to the tabled function. Biometrika 36 293–296.
  • White, H. (1980a). Using least squares to approximate unknown regression functions. Internat. Econom. Rev. 21 149–170.
  • White, H. (1980b). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48 817–838.
  • Wu, C. F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. Ann. Statist. 14 1261–1350.
  • Yang, L. and Tsiatis, A. A. (2001). Efficiency study of estimators for a treatment effect in a pretest-posttest trial. Amer. Statist. 55 314–321.
