Electronic Journal of Statistics

Spatial logistic regression and change-of-support in Poisson point processes

A. Baddeley, M. Berman, N.I. Fisher, A. Hardegen, R.K. Milne, D. Schuhmacher, R. Shah, and R. Turner

Full-text: Open access

Abstract

In Geographical Information Systems, spatial point pattern data are often analysed by dividing space into pixels, recording the presence or absence of points in each pixel, and fitting a logistic regression. We study weaknesses of this approach, propose improvements, and demonstrate an application to prospective geology in Western Australia. Models based on different pixel grids are incompatible (a ‘change-of-support’ problem) unless the pixels are very small. On a fine pixel grid, a spatial logistic regression is approximately a Poisson point process with loglinear intensity; we give explicit distributional bounds. For a loglinear Poisson process, the optimal parameter estimator from pixel data is not spatial logistic regression, but complementary log-log regression with an offset depending on pixel area. If the pixel raster is randomly subsampled, logistic regression is conditionally optimal. Bias and efficiency depend strongly on the spatial regularity of the covariates. For discontinuous covariates, we propose a new algorithmic strategy in which pixels are subdivided, and demonstrate its efficiency.
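The link between pixel presence/absence data and the complementary log-log model can be illustrated with a short numerical sketch. It is not the authors' code. It assumes the intensity is constant within each pixel, so that a pixel of area a with loglinear intensity exp(β0 + β1 z) contains at least one point with probability 1 − exp(−a·exp(β0 + β1 z)). Applying the complementary log-log transform to this probability gives log a + β0 + β1 z, which is why log a enters as an offset. All names below (`presence_prob`, `cloglog`, `fit_cloglog`) and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def presence_prob(area, log_intensity):
    """P(pixel contains >= 1 point) when the Poisson intensity is treated
    as constant, exp(log_intensity), over a pixel of the given area."""
    return 1.0 - np.exp(-area * np.exp(log_intensity))

def cloglog(p):
    """Complementary log-log transform log(-log(1 - p))."""
    return np.log(-np.log1p(-p))

def fit_cloglog(X, y, offset, n_iter=25):
    """Fisher scoring (IRLS) for a binary GLM with cloglog link and a fixed offset."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.clip(1.0 - np.exp(-np.exp(eta)), 1e-10, 1.0 - 1e-10)
        dmu = np.exp(eta) * (1.0 - mu)        # derivative of mu with respect to eta
        w = dmu**2 / (mu * (1.0 - mu))        # IRLS weights
        z = (eta - offset) + (y - mu) / dmu   # working response with the offset removed
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Simulated pixel data (illustrative values, not taken from the paper).
rng = np.random.default_rng(0)
beta_true = np.array([0.0, 1.0])   # assumed log-intensity coefficients
area = 0.25                        # assumed common pixel area
n_pixels = 50_000
covariate = rng.uniform(0.0, 1.0, n_pixels)
X = np.column_stack([np.ones(n_pixels), covariate])

p = presence_prob(area, X @ beta_true)
y = (rng.uniform(size=n_pixels) < p).astype(float)   # presence/absence per pixel

beta_hat = fit_cloglog(X, y, np.full(n_pixels, np.log(area)))
print("estimated coefficients:", beta_hat)
```

Because the offset absorbs the pixel area, the fitted coefficients estimate the parameters of the point process intensity itself rather than quantities tied to one particular grid. A logistic fit to the same data has no such offset, so its coefficients would vary with the pixel size.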

Article information

Source
Electron. J. Statist. Volume 4 (2010), 1151-1201.

Dates
First available: 8 November 2010

Permanent link to this document
http://projecteuclid.org/euclid.ejs/1289226498

Digital Object Identifier
doi:10.1214/10-EJS581

Mathematical Reviews number (MathSciNet)
MR2735883

Zentralblatt MATH identifier
06166537

Subjects
Primary: 62H11 (Directional data; spatial statistics), 62J12 (Generalized linear models)
Secondary: 60G55 (Point processes), 62M30 (Spatial processes), 60E99 (None of the above, but in this section)

Keywords
Change of support; complementary log-log regression; ecological fallacy; exponential family; generalized linear models; geographical information systems; likelihood; logistic regression; missing information principle; mixed pixels; mixels; modifiable area unit problem; modulated Poisson process; Poisson point process; prospective geology; prospectivity; spatial point process; spatial statistics; split pixels; Western Australia

Citation

Baddeley, A.; Berman, M.; Fisher, N.I.; Hardegen, A.; Milne, R.K.; Schuhmacher, D.; Shah, R.; Turner, R. Spatial logistic regression and change-of-support in Poisson point processes. Electronic Journal of Statistics 4 (2010), 1151-1201. doi:10.1214/10-EJS581. http://projecteuclid.org/euclid.ejs/1289226498.


