## Electronic Journal of Statistics

### Effective sample size for spatial regression models

#### Abstract

We propose a new definition of effective sample size. Although the recent works of Griffith (2005, 2008) and Vallejos and Osorio (2014) provide a theoretical framework to address the reduction of information in a spatial sample due to spatial autocorrelation, the asymptotic properties of the estimators have not been studied in those works or in earlier ones. In addition, the concept of effective sample size has been developed primarily for spatial regression processes with a constant mean. This paper introduces a new definition of effective sample size for general spatial regression models that is coherent with previous definitions. The asymptotic normality of the maximum likelihood estimator is obtained under an increasing domain framework. In particular, the conditions under which the limiting distribution holds are established for the Matérn covariance family. Illustrative examples accompany the discussion of the limiting results, including some cases where the asymptotic variance has a closed form. The asymptotic normality leads to an approximate hypothesis test that establishes whether there is redundant information in the sample. Simulation results support the theoretical findings and provide information about the behavior of the power of the suggested test. A real dataset collected under a transect sampling scheme is analyzed to estimate the effective sample size when a spatial linear regression model is assumed.
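To make the constant-mean notion mentioned in the abstract concrete, here is a minimal Python sketch. It assumes the definition ESS = 1ᵀR⁻¹1 from Vallejos and Osorio (2014), where R is the correlation matrix of the sample; the general regression definition introduced in this paper is not stated in the abstract and is not reproduced here. The exponential correlation function, the transect coordinates, the range value `phi` and the function names are all illustrative assumptions.

```python
import numpy as np


def exponential_correlation(coords, phi):
    """Correlation matrix with entries exp(-|s_i - s_j| / phi) for 1-D sites.

    The exponential model is an illustrative choice, not the paper's.
    """
    d = np.abs(coords[:, None] - coords[None, :])
    return np.exp(-d / phi)


def effective_sample_size(R):
    """Constant-mean effective sample size 1' R^{-1} 1 (assumed definition)."""
    ones = np.ones(R.shape[0])
    # Solve R x = 1 instead of forming the inverse explicitly.
    return float(ones @ np.linalg.solve(R, ones))


# Hypothetical transect of 20 equally spaced sites.
coords = np.arange(20, dtype=float)
ess_independent = effective_sample_size(np.eye(20))
ess_correlated = effective_sample_size(exponential_correlation(coords, phi=5.0))
```

With an identity correlation matrix the function returns the nominal sample size, while positive spatial correlation yields a smaller value, which is the reduction of information the abstract refers to.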

#### Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 3147-3180.

Dates
First available in Project Euclid: 27 September 2018

https://projecteuclid.org/euclid.ejs/1538013686

Digital Object Identifier
doi:10.1214/18-EJS1460

#### Citation

Acosta, Jonathan; Vallejos, Ronny. Effective sample size for spatial regression models. Electron. J. Statist. 12 (2018), no. 2, 3147–3180. doi:10.1214/18-EJS1460. https://projecteuclid.org/euclid.ejs/1538013686

#### References

• ACOSTA, J., OSORIO, F. and VALLEJOS, R. (2016). Effective sample size for line transect sampling models with an application to marine macroalgae. Journal of Agricultural, Biological and Environmental Statistics, 21, 407–425.
• BANERJEE, S., CARLIN, B. and GELFAND, A. (2004). Hierarchical Modeling and Analysis for Spatial Data. Boca Raton: Chapman & Hall/CRC.
• BERGER, J., BAYARRI, M. J. and PERICCHI, L. R. (2014). The effective sample size. Econometric Reviews, 33, 197–217.
• BOX, G. E. P. (1954a). Some theorems on quadratic forms applied in the study of analysis of variance problems: I. Effect of inequality of variance in the one way classification. Ann. Math. Statist., 25, 290–302.
• BOX, G. E. P. (1954b). Some theorems on quadratic forms applied in the study of analysis of variance problems: II. Effects of inequality of variance and of correlation between errors in the two way classification. Ann. Math. Statist., 25, 484–498.
• BOX, G. E. P. and COX, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B, 26, 211–252.
• CLIFFORD, P., RICHARDSON, S. and HÉMON, D. (1989). Assessing the significance of the correlation between two spatial processes. Biometrics, 45, 123–134.
• COGLEY, J. G. (1999). Effective sample size for glacier mass balance. Geogr. Ann. A, 81, 497–507.
• CRESSIE, N. (1993). Statistics for Spatial Data. New York: Wiley.
• CRESSIE, N. and LAHIRI, S. N. (1993). Asymptotic distribution of REML estimators. Journal of Multivariate Analysis, 45, 217–233.
• CRUJEIRAS, R. M. and VAN KEILEGOM, I. (2010). Least squares estimation of nonlinear spatial trends. Computational Statistics and Data Analysis, 54, 452–465.
• DALE, M. R. T. and FORTIN, M-J. (2009). Spatial autocorrelation and statistical tests: some solutions. Journal of Agricultural, Biological, and Environmental Statistics, 14, 188–206.
• de GRUIJTER, J. J. and ter BRAAK, C. J. F. (1990). Model-free estimation from spatial samples: A reappraisal of classical sampling theory. Mathematical Geology, 22, 407–415.
• DUTILLEUL, P. (1993). Modifying the $t$ test for assessing the correlation between two spatial processes. Biometrics, 49, 305–314.
• DUTILLEUL, P., PELLETIER, B. and ALPARGU, G. (2008). Modified $F$ tests for assessing the multiple correlation between one spatial process and several others. Journal of Statistical Planning and Inference, 138, 1402–1415.
• FAES, C., MOLENBERGHS, G., AERTS, M., VERBEKE, G. and KENWARD, M. (2009). The effective sample size and an alternative small-sample degrees-of-freedom method. The American Statistician, 63, 389–399.
• GELFAND, A. and VOUNATSOU, P. (2003). Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics, 4, 11–25.
• GELFAND, A. E., DIGGLE, P. J., FUENTES, M. and GUTTORP, P. (2010). Handbook of Spatial Statistics. Boca Raton, FL: Chapman & Hall/CRC.
• GOLUB, G. H. and PEREYRA, V. (1973). The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal on Numerical Analysis, 22, 413–431.
• GRIFFITH, D. (2005). Effective geographic sample size in the presence of spatial autocorrelation. Ann. Assoc. Amer. Geogr., 95, 740–760.
• GRIFFITH, D. (2008). Geographic sampling of urban soils for contaminant mapping: how many samples and from where. Environ. Geochem. Health, 30, 495–509.
• GRIFFITH, D. and PAELINCK, J. H. P. (2011). Non-Standard Spatial Statistics. New York: Springer.
• HAINING, R. (1990). Spatial Data Analysis in the Social and Environmental Sciences. Cambridge: Cambridge University Press.
• KUTNER, M., NACHTSHEIM, C., NETER, J. and LI, W. (2004). Applied Linear Statistical Models. Homewood, IL: McGraw-Hill/Irwin.
• KYUNG, M. and GHOSH, S. K. (2010). Maximum likelihood estimation for directional conditionally autoregressive models. Journal of Statistical Planning and Inference, 140, 3160–3179.
• LEDOIT, O. and WOLF, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. The Annals of Statistics, 30, 1081–1102.
• LENTH, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55, 187–193.
• MARDIA, K. and MARSHALL, R. (1984). Maximum likelihood of models for residual covariance in spatial regression. Biometrika, 71, 135–146.
• RICHARDSON, S. (1990). Some remarks on the testing of association between spatial processes. In: Griffith, D. (Ed.), Spatial Statistics: Past, Present, and Future. Institute of Mathematical Geography, Ann Arbor, MI, 277–309.
• SCHABENBERGER, O. and GOTWAY, C. A. (2005). Statistical Methods for Spatial Data Analysis. Boca Raton, FL: Chapman & Hall/CRC.
• SOO, YUH-WEN. and BATES, D. M. (1992). Loosely coupled nonlinear least squares. Computational Statistics and Data Analysis, 14, 249–259.
• STEIN, M. (1999). Interpolation of Spatial Data: Some Theory of Kriging. New York: Springer.
• TELFORD, R. J. and BIRKS, H. J. B. (2009). Evaluation of transfer functions in spatially structured environments. Quat. Sci. Rev., 28, 1309–1316.
• THIÉBAUX, H. J. and ZWIERS, F. W. (1984). Interpretation and estimation of effective sample size. J. Climate Appl. Meteor., 23, 800–811.
• VALLEJOS, R. and OSORIO, F. (2014). Effective sample size for spatial process models. Spat. Stat., 9, 66–92.