## The Annals of Applied Statistics

### A coupled ETAS-I2GMM point process with applications to seismic fault detection

#### Abstract

The epidemic-type aftershock sequence (ETAS) point process is a common model for the occurrence of earthquake events. The ETAS model consists of a stationary background Poisson process modeling spontaneous earthquakes and a triggering kernel representing the space–time-magnitude distribution of aftershocks. Popular nonparametric methods for estimation of the background intensity include histograms and kernel density estimators. While these methods are able to capture local spatial heterogeneity in the intensity of spontaneous events, they do not adequately capture patterns resulting from fault line structure over larger spatial scales. Here we propose a two-layer infinite Gaussian mixture model for clustering of earthquake events into fault-like groups over intermediate spatial scales. We introduce a Monte Carlo expectation-maximization (EM) algorithm for joint inference of the ETAS-I2GMM model and then apply the model to the Southern California Earthquake Catalog. We illustrate the advantages of the ETAS-I2GMM model in terms of both goodness of fit of the intensity and recovery of fault line clusters in the Community Fault Model 3.0 from earthquake occurrence data.
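As a rough illustration of the two ingredients named above (a background rate plus a triggering kernel summed over past events), the following sketch evaluates an ETAS-style conditional intensity. It is not the authors' implementation: the function name `etas_intensity`, every parameter value, the constant background rate `mu`, and the isotropic Gaussian spatial kernel are assumptions chosen for readability. The temporal decay follows the widely used Omori-Utsu form, and in the paper the background is modeled by the I2GMM rather than by a constant.

```python
import math


def etas_intensity(t, x, y, events, mu=0.1, A=0.5, alpha=1.0, m0=3.0,
                   c=0.01, p=1.2, sigma=0.05):
    """Illustrative ETAS-style conditional intensity at time t, location (x, y).

    events: iterable of (t_i, x_i, y_i, m_i) tuples (time, location, magnitude).
    All parameter defaults are hypothetical, not estimates from any catalog.
    """
    # Background term: a constant rate here (assumption made for simplicity).
    rate = mu
    for t_i, x_i, y_i, m_i in events:
        dt = t - t_i
        if dt <= 0:
            continue  # only strictly earlier events can trigger aftershocks
        # Expected number of direct aftershocks grows exponentially with magnitude.
        productivity = A * math.exp(alpha * (m_i - m0))
        # Omori-Utsu temporal decay, normalized to integrate to 1 (requires p > 1).
        omori = (p - 1) / c * (1 + dt / c) ** (-p)
        # Isotropic Gaussian spatial kernel (assumption), normalized in 2D.
        r2 = (x - x_i) ** 2 + (y - y_i) ** 2
        spatial = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
        rate += productivity * omori * spatial
    return rate
```

With an empty event history the function returns the background rate alone. Each earlier event adds a contribution that decays with elapsed time and with distance from its epicenter.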

#### Article information

Source
Ann. Appl. Stat., Volume 12, Number 3 (2018), 1853-1870.

Dates
Revised: January 2018
First available in Project Euclid: 11 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1536652977

Digital Object Identifier
doi:10.1214/18-AOAS1134

Mathematical Reviews number (MathSciNet)
MR3852700

#### Citation

Cheng, Yicheng; Dundar, Murat; Mohler, George. A coupled ETAS-I2GMM point process with applications to seismic fault detection. Ann. Appl. Stat. 12 (2018), no. 3, 1853–1870. doi:10.1214/18-AOAS1134. https://projecteuclid.org/euclid.aoas/1536652977
