Open Access
Effective sample size for spatial regression models
Jonathan Acosta, Ronny Vallejos
Electron. J. Statist. 12(2): 3147-3180 (2018). DOI: 10.1214/18-EJS1460
Abstract

We propose a new definition of effective sample size. Although the recent works of Griffith (2005, 2008) and Vallejos and Osorio (2014) provide a theoretical framework to address the reduction of information in a spatial sample due to spatial autocorrelation, the asymptotic properties of the estimators have not been studied in those works or in earlier ones. In addition, the concept of effective sample size has been developed primarily for spatial regression processes with a constant mean. This paper introduces a new definition of effective sample size for general spatial regression models that is coherent with previous definitions. The asymptotic normality of the maximum likelihood estimator is obtained under an increasing domain framework. In particular, the conditions under which the limiting distribution holds are established for the Matérn covariance family. Illustrative examples accompany the discussion of the limiting results, including some cases where the asymptotic variance has a closed form. The asymptotic normality leads to an approximate hypothesis test that establishes whether there is redundant information in the sample. Simulation results support the theoretical findings and provide information about the behavior of the power of the suggested test. A real dataset in which a transect sampling scheme has been used is analyzed to estimate the effective sample size when a spatial linear regression model is assumed.
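
To make the idea of information loss under spatial autocorrelation concrete, the following is a minimal sketch for the constant-mean case only. It assumes the form ESS = 1ᵀR⁻¹1, with R the correlation matrix of the sample, which is the definition usually associated with Vallejos and Osorio (2014); the abstract does not state this formula, and the sketch does not implement the new definition for general regression models proposed in the paper. The exponential correlation (a Matérn member with smoothness 1/2), the transect layout, the range values, and the function names are all illustrative assumptions.

```python
import numpy as np


def exponential_correlation(coords, phi):
    """Exponential correlation matrix exp(-d / phi) for the given locations.

    `coords` may be 1-D (points on a transect) or 2-D (n x d coordinates).
    `phi` is a hypothetical range parameter chosen for illustration.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim == 1:
        coords = coords[:, None]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / phi)


def effective_sample_size(R):
    """Constant-mean effective sample size, assumed here to be 1' R^{-1} 1."""
    ones = np.ones(R.shape[0])
    # Solve R x = 1 instead of forming the inverse explicitly.
    return float(ones @ np.linalg.solve(R, ones))


# Twenty equally spaced points on a transect (illustrative layout).
transect = np.arange(20.0)

ess_indep = effective_sample_size(np.eye(20))
ess_weak = effective_sample_size(exponential_correlation(transect, phi=0.5))
ess_strong = effective_sample_size(exponential_correlation(transect, phi=5.0))

print(f"independent: {ess_indep:.2f}")
print(f"weak correlation (phi=0.5): {ess_weak:.2f}")
print(f"strong correlation (phi=5.0): {ess_strong:.2f}")
```

Under these assumptions the independent case recovers the nominal sample size, and the value shrinks as the correlation range grows, which is the sense in which a spatially correlated sample carries redundant information.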

References

1.

ACOSTA, J., OSORIO, F. and VALLEJOS, R. (2016). Effective sample size for line transect sampling models with an application to marine macroalgae. Journal of Agricultural, Biological and Environmental Statistics, 21, 407–425. 1347.62236 10.1007/s13253-016-0252-7

2.

BANERJEE, S., CARLIN, B. and GELFAND, A. (2004). Hierarchical Modeling and Analysis for Spatial Data. Boca Raton: Chapman & Hall/CRC. 1053.62105

3.

BERGER, J., BAYARRI, M. J. and PERICCHI, L. R. (2014). The effective sample size. Econometric Reviews, 33, 197–217. MR3170846 10.1080/07474938.2013.807157

4.

BOX, G. E. P. (1954a). Some theorems on quadratic forms applied in the study of analysis of variance problems: I. Effect of inequality of variance in the one way classification. Ann. Math. Statist., 25, 290–302. 0055.37305 10.1214/aoms/1177728786 euclid.aoms/1177728786

5.

BOX, G. E. P. (1954b). Some theorems on quadratic forms applied in the study of analysis of variance problems: II. Effects of inequality of variance and of correlation between errors in the two way classification. Ann. Math. Statist., 25, 484–498. 0056.36604 10.1214/aoms/1177728717 euclid.aoms/1177728717

6.

BOX, G. E. P. and COX, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B, 26, 211–252. 0156.40104 10.1111/j.2517-6161.1964.tb00553.x

7.

CLIFFORD, P., RICHARDSON, S. and HÉMON, D. (1989). Assessing the significance of the correlation between two spatial processes. Biometrics, 45, 123–134.

8.

COGLEY, J. G. (1999). Effective sample size for glacier mass balance. Geogr. Ann. A, 81, 497–507.

9.

CRESSIE, N. (1993). Statistics for Spatial Data. New York: Wiley. 1347.62005

10.

CRESSIE, N. and LAHIRI, S. N. (1993). Asymptotic distribution of REML estimators. Journal of Multivariate Analysis, 45, 217–233. 0772.62008 10.1006/jmva.1993.1034

11.

CRUJEIRAS, R. M. and VAN KEILEGOM, I. (2010). Least squares estimation of nonlinear spatial trends. Computational Statistics and Data Analysis, 54, 452–465. 05689602 10.1016/j.csda.2009.09.014

12.

DALE, M. R. T. and FORTIN, M-J. (2009). Spatial autocorrelation and statistical tests: some solutions. Journal of Agricultural, Biological, and Environmental Statistics, 14, 188–206. 1306.62263 10.1198/jabes.2009.0012

13.

de GRUIJTER, J. J. and ter BRAAK, C. J. F. (1990). Model-free estimation from spatial samples: A reappraisal of classical sampling theory. Mathematical Geology, 22, 407–415. 0970.86520 10.1007/BF00890327

14.

DUTILLEUL, P. (1993). Modifying the $t$ test for assessing the correlation between two spatial processes. Biometrics, 49, 305–314.

15.

DUTILLEUL, P., PELLETIER, B. and ALPARGU, G. (2008). Modified $F$ tests for assessing the multiple correlation between one spatial process and several others. Journal of Statistical Planning and Inference, 138, 1402–1415. 1133.62076 10.1016/j.jspi.2007.06.022

16.

FAES, C., MOLENBERGHS, G., AERTS, M., VERBEKE, G. and KENWARD, M. (2009). The effective sample size and an alternative small-sample degrees-of-freedom method. The American Statistician, 63, 389–399. 1182.62098 10.1198/tast.2009.08196

17.

GELFAND, A. and VOUATSOU, P. (2003). Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics, 4, 11–25.

18.

GELFAND, A. E., DIGGLE, P. J., FUENTES, M. and GUTTORP, P. (2010). Handbook of Spatial Statistics. Boca Raton, FL: Chapman & Hall/CRC. 1188.62284

19.

GOLUB, G. H. and PEREYRA, V. (1973). The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal on Numerical Analysis, 10, 413–431. 0258.65045 10.1137/0710036

20.

GRIFFITH, D. (2005). Effective geographic sample size in the presence of spatial autocorrelation. Ann. Assoc. Amer. Geogr., 95, 740–760.

21.

GRIFFITH, D. (2008). Geographic sampling of urban soils for contaminant mapping: how many samples and from where. Environ. Geochem. Health, 30, 495–509.

22.

GRIFFITH, D. and PAELINCK, J. H. P. (2011). Non-Standard Spatial Statistics. New York: Springer.

23.

HAINING, R. (1990). Spatial Data Analysis in the Social and Environmental Sciences. Cambridge: Cambridge University Press.

24.

KUTNER, M., NACHTSHEIM, C., NETER, J. and LI, W. (2004). Applied Linear Statistical Models. Homewood, IL: McGraw-Hill/Irwin.

25.

KYUNG, M. and GHOSH, S. K. (2010). Maximum likelihood estimation for directional conditionally autoregressive models. Journal of Statistical Planning and Inference, 140, 3160–3179. 1204.62164 10.1016/j.jspi.2010.04.012

26.

LEDOIT, O. and WOLF, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. The Annals of Statistics, 30, 1081–1102. 1029.62049 10.1214/aos/1031689018 euclid.aos/1031689018

27.

LENTH, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55, 187–193.

28.

MARDIA, K. and MARSHALL, R. (1984). Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71, 135–146. 0542.62079 10.1093/biomet/71.1.135

29.

RICHARDSON, S. (1990). Some remarks on the testing of association between spatial processes. In: Griffith, D. (Ed.), Spatial Statistics: Past, Present, and Future. Institute of Mathematical Geography, Ann Arbor, MI, 277–309.

30.

SCHABENBERGER, O. and GOTWAY, C. A. (2005). Statistical Methods for Spatial Data Analysis. Boca Raton, FL: Chapman & Hall/CRC. 1068.62096

31.

SOO, YUH-WEN and BATES, D. M. (1992). Loosely coupled nonlinear least squares. Computational Statistics and Data Analysis, 14, 249–259. 0875.62281 10.1016/0167-9473(92)90177-H

32.

STEIN, M. (1999). Interpolation of Spatial Data: Some Theory of Kriging. New York: Springer. MR1697409 0924.62100

33.

TELFORD, R. J. and BIRKS, H. J. B. (2009). Evaluation of transfer functions in spatially structured environments. Quat. Sci. Rev., 28, 1309–1316.

34.

THIÉBAUX, H. J. and ZWIERS, F. W. (1984). Interpretation and estimation of effective sample size. J. Climate Appl. Meteor., 23, 800–811.

35.

VALLEJOS, R. and OSORIO, F. (2014). Effective sample size for spatial process models. Spat. Stat., 9, 66–92.
Jonathan Acosta and Ronny Vallejos "Effective sample size for spatial regression models," Electronic Journal of Statistics 12(2), 3147-3180, (2018). https://doi.org/10.1214/18-EJS1460
Received: 1 September 2017; Published: 2018