The Annals of Applied Statistics

Are private schools better than public schools? Appraisal for Ireland by methods for observational studies

Danny Pfeffermann and Victoria Landsman

Full-text: Open access


Abstract

In observational studies the assignment of units to treatments is not under the investigator's control. Consequently, the estimation and comparison of treatment effects based on the empirical distribution of the responses can be biased, since the units exposed to the various treatments could differ in important unknown pretreatment characteristics that are related to the response. An important example studied in this article is the question of whether private schools offer a better quality of education than public schools. To address this question, we use data collected in the year 2000 by the OECD for the Programme for International Student Assessment (PISA). Focusing for illustration on scores in mathematics of 15-year-old pupils in Ireland, we find that the raw average score of pupils in private schools is higher than that of pupils in public schools. However, application of a newly proposed method for observational studies suggests that the less able pupils tend to enroll in public schools, so that their lower scores do not necessarily indicate poor quality of the public schools. Indeed, when comparing the average score in the two types of schools after adjusting for the enrollment effects, we find, quite surprisingly, that public schools perform better on average. This outcome is supported by the methods of instrumental variables and latent variables, commonly used by econometricians for analyzing and evaluating social programs.
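The selection effect described in the abstract can be illustrated with a small simulation. The sketch below is not the method proposed in the article and does not use the PISA data. It swaps in standard inverse-probability weighting (a normalized, Hájek-type estimator) and makes the idealized assumption that the enrollment probabilities are known, whereas the article addresses the harder case of unknown pretreatment characteristics. All names and numbers (`simulate`, `true_effect`, the score model) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate(n=50000, true_effect=-5.0, seed=0):
    """Simulate pupils whose ability drives both enrollment and score.

    Hypothetical data-generating process: more able pupils are more likely
    to enroll in a private school, and the private-school effect on the
    score is `true_effect` (negative here, i.e. public schools do better).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        # Logistic enrollment model: probability of attending a private school.
        p_private = 1.0 / (1.0 + math.exp(-ability))
        private = rng.random() < p_private
        score = 500.0 + 40.0 * ability + rng.gauss(0.0, 20.0)
        if private:
            score += true_effect
        rows.append((private, score, p_private))
    return rows


def raw_difference(rows):
    """Unadjusted difference in mean scores: private minus public."""
    priv = [score for private, score, _ in rows if private]
    pub = [score for private, score, _ in rows if not private]
    return sum(priv) / len(priv) - sum(pub) / len(pub)


def ipw_difference(rows):
    """Normalized inverse-probability-weighted difference: private minus public.

    Each pupil is weighted by the inverse of the probability of the school
    type actually attended. The probabilities are taken as known, which is
    an assumption of this sketch.
    """
    num1 = sum(score / p for private, score, p in rows if private)
    den1 = sum(1.0 / p for private, _, p in rows if private)
    num0 = sum(score / (1.0 - p) for private, score, p in rows if not private)
    den0 = sum(1.0 / (1.0 - p) for private, _, p in rows if not private)
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    data = simulate()
    print("raw difference:", round(raw_difference(data), 2))
    print("weighted difference:", round(ipw_difference(data), 2))
```

In this toy setting the raw difference favors private schools because abler pupils select into them, while the weighted difference has the sign of the simulated effect and favors public schools.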

Article information

Ann. Appl. Stat., Volume 5, Number 3 (2011), 1726–1751.

First available in Project Euclid: 13 October 2011

Digital Object Identifier: doi:10.1214/11-AOAS456

Keywords

Average treatment effect; goodness of fit; identifiability; instrumental variables; private-dependent schools; propensity scores; sample distribution


Citation

Pfeffermann, Danny; Landsman, Victoria. Are private schools better than public schools? Appraisal for Ireland by methods for observational studies. Ann. Appl. Stat. 5 (2011), no. 3, 1726–1751. doi:10.1214/11-AOAS456.


Supplemental materials

  • Supplementary material: Supplement to “Are private schools better than public schools? Appraisal for Ireland by methods for observational studies”. The supplement is a PDF in five sections. Supplement A develops the probability-weighted estimators of the ATE. Supplement B describes the maximization of the likelihood (4.3). Supplement C contains the proof of Lemma 1. Supplement D contains the proof of Result 1. Supplement E describes the accompanying data file, PISA_math2000.R, which contains the data.