Open Access
Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations
Jo Bovy, David W. Hogg, Sam T. Roweis
Ann. Appl. Stat. 5(2B): 1657-1677 (June 2011). DOI: 10.1214/10-AOAS439

Abstract

We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation–Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or “underlying” distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a “split-and-merge” procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
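The abstract describes an EM algorithm in which each observation is a noisy, possibly lower-dimensional projection of a draw from an underlying Gaussian mixture. The sketch below shows one EM iteration consistent with that description; it is a minimal illustration, not the authors' released implementation, and the names (w, S, R, amp, mean, covar, xd_em_step) are assumptions chosen for clarity.

```python
# Minimal NumPy/SciPy sketch of one EM iteration for a Gaussian mixture whose
# samples are observed through per-point projections R_i with per-point noise
# covariances S_i, as described in the abstract. Illustrative only.
import numpy as np
from scipy.stats import multivariate_normal

def xd_em_step(w, S, R, amp, mean, covar):
    """w: list of N observed vectors w_i (dimension d_i each)
    S: list of N noise covariances S_i (d_i x d_i)
    R: list of N projection matrices R_i (d_i x d), encoding missing directions
    amp: (K,) mixture amplitudes; mean: (K, d); covar: (K, d, d)"""
    N, K = len(w), len(amp)
    d = mean.shape[1]
    q = np.zeros((N, K))        # responsibilities
    b = np.zeros((N, K, d))     # conditional means of the underlying value
    B = np.zeros((N, K, d, d))  # conditional covariances of the underlying value

    # E-step: condition each mixture component on the noisy, projected datum.
    for i in range(N):
        for j in range(K):
            T = R[i] @ covar[j] @ R[i].T + S[i]
            q[i, j] = amp[j] * multivariate_normal.pdf(w[i], R[i] @ mean[j], T)
            gain = covar[j] @ R[i].T @ np.linalg.inv(T)
            b[i, j] = mean[j] + gain @ (w[i] - R[i] @ mean[j])
            B[i, j] = covar[j] - gain @ R[i] @ covar[j]
        q[i] /= q[i].sum()

    # M-step: update the parameters of the underlying (deconvolved) mixture.
    qj = q.sum(axis=0)
    amp = qj / N
    mean = np.einsum('ij,ijd->jd', q, b) / qj[:, None]
    covar = covar.copy()
    for j in range(K):
        diff = b[:, j] - mean[j]
        covar[j] = (q[:, j, None, None] *
                    (np.einsum('id,ie->ide', diff, diff) + B[:, j])).sum(axis=0) / qj[j]
    return amp, mean, covar
```

Iterating this step to convergence yields the error-deconvolved mixture; the conjugate priors and the split-and-merge refinements mentioned in the abstract would modify the M-step and the initialization, respectively, and are not shown here.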

Citation

Jo Bovy, David W. Hogg, Sam T. Roweis. "Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations." Ann. Appl. Stat. 5(2B): 1657-1677, June 2011. https://doi.org/10.1214/10-AOAS439

Information

Published: June 2011
First available in Project Euclid: 13 July 2011

zbMATH: 1223.62029
MathSciNet: MR2849790
Digital Object Identifier: 10.1214/10-AOAS439

Keywords: Bayesian inference, density estimation, Expectation–Maximization, missing data, multivariate estimation, noise

Rights: Copyright © 2011 Institute of Mathematical Statistics
