Statistical Science

Some Statistical and Computational Challenges, and Opportunities in Astronomy

G. Jogesh Babu and S. George Djorgovski

Source: Statist. Sci. Volume 19, Number 2 (2004), 322-332.

Abstract

The data complexity and volume of astronomical findings have increased in recent decades due to major technological improvements in instrumentation and data collection methods. The contemporary astronomer is flooded with terabytes of raw data that produce enormous multidimensional catalogs of objects (stars, galaxies, quasars, etc.) numbering in the billions, with hundreds of measured numbers for each object. The astronomical community thus faces a key task: to enable efficient and objective scientific exploitation of enormous multifaceted data sets and the complex links between data and astrophysical theory. In recognition of this task, the National Virtual Observatory (NVO) initiative recently emerged to federate numerous large digital sky archives, and to develop tools to explore and understand these vast volumes of data. The effective use of such integrated massive data sets presents a variety of new challenging statistical and algorithmic problems that require methodological advances. An interdisciplinary team of statisticians, astronomers and computer scientists from The Pennsylvania State University, California Institute of Technology and Carnegie Mellon University is developing statistical methodology for the NVO. A brief glimpse into the Virtual Observatory and the work of the Penn State-led team is provided here.

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1105714166
Digital Object Identifier: doi:10.1214/088342304000000774
Mathematical Reviews number (MathSciNet): MR2146945
Zentralblatt MATH identifier: 1100.85500

2010 © Institute of Mathematical Statistics