Electronic Journal of Statistics

Monte Carlo modified profile likelihood in models for clustered data

Claudia Di Caterina, Giuliana Cortese, and Nicola Sartori



The main focus of the analysts who deal with clustered data is usually not on the clustering variables, and hence the group-specific parameters are treated as nuisance. If a fixed effects formulation is preferred and the total number of clusters is large relative to the single-group sizes, classical frequentist techniques relying on the profile likelihood are often misleading. The use of alternative tools, such as modifications to the profile likelihood or integrated likelihoods, for making accurate inference on a parameter of interest can be complicated by the presence of nonstandard modelling and/or sampling assumptions. We show here how to employ Monte Carlo simulation in order to approximate the modified profile likelihood in some of these unconventional frameworks. The proposed solution is widely applicable and is shown to retain the usual properties of the modified profile likelihood. The approach is examined in two instances particularly relevant in applications, i.e. missing-data models and survival models with unspecified censoring distribution. The effectiveness of the proposed solution is validated via simulation studies and two clinical trial applications.
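The claim that profile likelihood inference can mislead when there are many small clusters can be seen in the classical Neyman and Scott setting. The sketch below is only an illustration under stated assumptions, not the Monte Carlo procedure proposed in the paper: it assumes normal observations with one mean per cluster and a common variance, and the function name `neyman_scott_estimates` and all simulation settings are hypothetical. In this textbook model the profile likelihood estimator of the variance divides the within-cluster sum of squares by q·m, while the modified profile likelihood is known to lead to the degrees-of-freedom correction, dividing by q·(m − 1).

```python
import random


def neyman_scott_estimates(clusters):
    """Return (profile, modified) variance estimates for balanced clusters.

    Assumes y_ij ~ N(mu_i, sigma^2) with a separate nuisance mean mu_i
    for each of the q clusters and m observations per cluster.
    """
    q = len(clusters)
    m = len(clusters[0])
    # Within-cluster sum of squares: each mu_i is replaced by its cluster mean.
    ss = 0.0
    for obs in clusters:
        mean = sum(obs) / m
        ss += sum((v - mean) ** 2 for v in obs)
    profile = ss / (q * m)          # maximizer of the profile likelihood
    modified = ss / (q * (m - 1))   # degrees-of-freedom corrected estimator
    return profile, modified


# Illustrative settings (assumptions): many clusters, two observations each.
random.seed(0)
q, m, sigma = 2000, 2, 1.0
clusters = []
for _ in range(q):
    mu_i = random.gauss(0.0, 3.0)  # cluster-specific nuisance mean
    clusters.append([random.gauss(mu_i, sigma) for _ in range(m)])

profile, modified = neyman_scott_estimates(clusters)
print(f"profile likelihood estimate of sigma^2: {profile:.3f}")
print(f"corrected estimate of sigma^2:          {modified:.3f}")
```

With a true variance of 1 and m = 2, the profile likelihood estimate concentrates around (m − 1)/m = 0.5 however many clusters are added, whereas the corrected estimate concentrates around 1. The models treated in the paper (missing data, censored survival times) do not admit such a closed-form correction, which is where a simulation-based approximation becomes relevant.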

Article information

Electron. J. Statist., Volume 13, Number 1 (2019), 432-464.

Received: December 2017
First available in Project Euclid: 12 February 2019

Primary: 62G20: Asymptotic properties

Keywords: Censored data; nonignorable missing data; nuisance parameter; profile likelihood; two-index asymptotics

Creative Commons Attribution 4.0 International License.


Di Caterina, Claudia; Cortese, Giuliana; Sartori, Nicola. Monte Carlo modified profile likelihood in models for clustered data. Electron. J. Statist. 13 (2019), no. 1, 432--464. doi:10.1214/19-EJS1532. https://projecteuclid.org/euclid.ejs/1549962031


