## Electronic Journal of Statistics

### Deconvolution estimation of mixture distributions with boundaries

#### Abstract

In this paper, motivated by an important problem in evolutionary biology, we develop two sieve-type estimators for distributions that are mixtures of a finite number of discrete atoms and continuous distributions, under the framework of measurement error models. While there is a large literature on deconvolution problems, only two articles have previously addressed the problem taken up in our article, and they use relatively standard Fourier deconvolution. As a result, the estimators suggested in those two articles are seriously degraded by boundary effects and negativity. A major contribution of our article is the correct handling of boundary effects: our method is asymptotically unbiased at the boundaries and is guaranteed to be nonnegative. We use roughness penalization to improve the smoothness of the resulting estimator and to reduce the estimation variance. We illustrate the performance of the proposed estimators via our real driving application in evolutionary biology and two simulation studies. Furthermore, we establish asymptotic properties of the proposed estimators.

#### Article information

Source
Electron. J. Statist., Volume 7 (2013), 323-341.

Dates
First available in Project Euclid: 28 January 2013

https://projecteuclid.org/euclid.ejs/1359382682

Digital Object Identifier
doi:10.1214/13-EJS774

Mathematical Reviews number (MathSciNet)
MR3020423

Zentralblatt MATH identifier
1337.62068

#### Citation

Lee, Mihee; Hall, Peter; Shen, Haipeng; Marron, J. S.; Tolle, Jon; Burch, Christina. Deconvolution estimation of mixture distributions with boundaries. Electron. J. Statist. 7 (2013), 323–341. doi:10.1214/13-EJS774. https://projecteuclid.org/euclid.ejs/1359382682
