Brazilian Journal of Probability and Statistics

Recent developments in complex and spatially correlated functional data

Israel Martínez-Hernández and Marc G. Genton

Abstract

As high-dimensional and high-frequency data are being collected on a large scale, the development of new statistical models is being pushed forward. Functional data analysis provides the required statistical methods to deal with large-scale and complex data by assuming that data are continuous functions, for example, realizations of a continuous process (curves) or continuous random field (surfaces), and that each curve or surface is considered as a single observation. Here, we provide an overview of functional data analysis when data are complex and spatially correlated. We provide definitions and estimators of the first and second moments of the corresponding functional random variable. We present two main approaches: The first assumes that data are realizations of a functional random field, that is, each observation is a curve with a spatial component. We call them spatial functional data. The second approach assumes that data are continuous deterministic fields observed over time. In this case, one observation is a surface or manifold, and we call them surface time series. For these two approaches, we describe software available for the statistical analysis. We also present a data illustration, using a high-resolution wind speed simulated dataset, as an example of the two approaches. The functional data approach offers a new paradigm of data analysis, where the continuous processes or random fields are considered as a single entity. We consider this approach to be very valuable in the context of big data.

Article information

Source
Braz. J. Probab. Stat., Volume 34, Number 2 (2020), 204-229.

Dates
Received: January 2020
Accepted: January 2020
First available in Project Euclid: 4 May 2020

Permanent link to this document
https://projecteuclid.org/euclid.bjps/1588579218

Digital Object Identifier
doi:10.1214/20-BJPS466

Mathematical Reviews number (MathSciNet)
MR4093256

Keywords
Functional data; functional random field; manifold data; spatial functional data; spatial statistics; spatio-temporal statistics; surface data

Citation

Martínez-Hernández, Israel; Genton, Marc G. Recent developments in complex and spatially correlated functional data. Braz. J. Probab. Stat. 34 (2020), no. 2, 204–229. doi:10.1214/20-BJPS466. https://projecteuclid.org/euclid.bjps/1588579218


