The Annals of Applied Statistics

Estimation and extrapolation of time trends in registry data—Borrowing strength from related populations

Andrea Riebler, Leonhard Held, and Håvard Rue

Abstract

Age-period-cohort (APC) models are widely used to analyze and project age-specific mortality or morbidity rates. Bayesian approaches facilitate estimation and improve predictions by assigning smoothing priors to the age, period and cohort effects. Adjustments for overdispersion are straightforward using additional random effects. When rates are further stratified, for example, by country, multivariate APC models can be used, in which differences between stratum-specific effects are interpretable as log relative risks. Here, we incorporate correlated stratum-specific smoothing priors and correlated overdispersion parameters into the multivariate APC model, and use Markov chain Monte Carlo and integrated nested Laplace approximations for inference. Compared to a model without correlation, the new approach may lead to more precise relative risk estimates, as shown in an application to chronic obstructive pulmonary disease mortality in three regions of England and Wales. Furthermore, the imputation of missing data for one particular stratum may be improved, since the new approach borrows information from the remaining strata whenever the corresponding observations are available there. This is shown in an application to female mortality in Denmark, Sweden and Norway during the 20th century, where, for each country in turn, we treat either the first or the second half of the observations as missing and then impute the omitted data. The projections are compared to those obtained from a univariate APC model and an extended Lee–Carter demographic forecasting approach using the proper Dawid–Sebastiani scoring rule.
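As a rough illustration of the scoring rule mentioned at the end of the abstract, the sketch below computes the Dawid–Sebastiani score in its commonly stated form, the squared standardized prediction error plus twice the log of the predictive standard deviation, where smaller values indicate better predictions. The function names, death counts and predictive moments are hypothetical and are not taken from the paper or its supplement.

```python
import math


def dawid_sebastiani_score(y, mean, sd):
    """Score one observation y against a predictive mean and standard deviation.

    Uses the common form ((y - mean) / sd)^2 + 2 * log(sd); smaller is better.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (y - mean) / sd
    return z * z + 2.0 * math.log(sd)


def mean_score(observed, means, sds):
    """Average the score over a set of predicted cells (e.g. age-period cells)."""
    scores = [
        dawid_sebastiani_score(y, m, s)
        for y, m, s in zip(observed, means, sds)
    ]
    return sum(scores) / len(scores)


# Hypothetical counts and predictive moments from two competing models.
observed = [120, 95, 143]
model_a = mean_score(observed, [118.0, 99.0, 140.0], [11.0, 10.0, 12.0])
model_b = mean_score(observed, [105.0, 110.0, 160.0], [11.0, 10.0, 12.0])

# Both models state the same uncertainty, but model A is better centred,
# so it receives the smaller (better) mean score.
print(model_a < model_b)
```

Because the score depends on the predictive distribution only through its mean and variance, it can be evaluated from posterior predictive summaries without access to the full predictive distribution.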

Article information

Source
Ann. Appl. Stat., Volume 6, Number 1 (2012), 304-333.

Dates
First available in Project Euclid: 6 March 2012

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1331043398

Digital Object Identifier
doi:10.1214/11-AOAS498

Mathematical Reviews number (MathSciNet)
MR2951539

Zentralblatt MATH identifier
1235.62030

Keywords
Bayesian analysis; INLA; multivariate age-period-cohort model; projections; uniform correlation matrix

Citation

Riebler, Andrea; Held, Leonhard; Rue, Håvard. Estimation and extrapolation of time trends in registry data—Borrowing strength from related populations. Ann. Appl. Stat. 6 (2012), no. 1, 304–333. doi:10.1214/11-AOAS498. https://projecteuclid.org/euclid.aoas/1331043398

Supplemental materials

  • Supplementary material: Code repository for the cross-prediction study of overall mortality of Scandinavian women. This repository archives the data, R code and results for the study presented in Section 4.2. In particular, it contains the code to produce Table 1 and Figures 5–11.