In network data analysis, multivariate spatial autoregression (MSAR) models may be used to analyze the autocorrelation among multiple responses. With large-scale networks, the estimation for MSAR on the entire network is computationally expensive. In this case, the subsampling method could be adopted. This approach selects a sample of nodes and then uses the estimate based on the sample to approximate the estimate on the full data. However, traditional sampling methods cannot obtain unbiased parameter estimates. Considering the second-order friend information of sampled nodes, we propose the crawling subsampling (CS) method for a general framework. Thereafter, based on the sampled data only, we construct the least-squares objective function. Under certain conditions, the computational complexity of optimizing the objective function is linear with the sample size . The identification condition for the parameters on the sampled network is theoretically provided. The sample size order requirement is provided, and the asymptotic properties of the least-squares estimators are investigated. The numerical results for the simulated and real data are presented to demonstrate the performance of the proposed CS method and least-squares estimator for the MSAR model.
Danyang Huang’s research is partially supported by the National Natural Science Foundation of China (NSFC, 12071477), building world-class universities (disciplines) of the Renmin University of China. Bingyi Jing’s research is partially supported by the HK RGC Grants GRF16302020, GRF 16304419, GRF 16305616, and the State Key Program of National Natural Science Foundation of China grant NSFC11631003. Bo Zhang is supported by the National Natural Science Foundation of China (NSFC,71873137), building world-class universities (disciplines) of the Renmin University of China.
"Crawling subsampling for multivariate spatial autoregression model in large-scale networks." Electron. J. Statist. 15 (2) 3678 - 3707, 2021. https://doi.org/10.1214/21-EJS1872