Brazilian Journal of Probability and Statistics

The reliability of statistical functions in four software packages freely used in numerical computation

Marcelo G. Almiron, Eliana S. Almeida, and Marcio N. Miranda

Source: Braz. J. Probab. Stat. Volume 23, Number 2 (2009), 107-119.

Abstract

This work compares the accuracy of statistical routines in four freely used statistical software packages: Octave, academic Ox, Python, and R. These packages offer extensive libraries for statistical computing, with applications in image processing, and are useful for data analysis and visualization. They are assessed with the National Institute of Standards and Technology datasets and McCullough’s methodology. In the analysis performed here, R yielded the best results and had the most comprehensive library.
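The abstract does not state which accuracy measure is used. Assessments that follow McCullough (1998) typically report the log relative error (LRE), which approximates the number of correct significant digits in a computed value relative to a certified one. The sketch below is a minimal illustration under that assumption, not the authors' code: the function names, the 15-digit cap, and the three-point dataset (whose exact sample standard deviation is 1 by construction) are all illustrative choices.

```python
import math


def log_relative_error(computed, certified, max_digits=15):
    """Approximate number of correct significant digits (LRE).

    Assumed conventions: exact agreement is capped at max_digits,
    a certified value of zero falls back to the log absolute error,
    and results are clamped to the range [0, max_digits].
    """
    if computed == certified:
        return float(max_digits)
    if certified == 0:
        err = abs(computed)
    else:
        err = abs(computed - certified) / abs(certified)
    return max(0.0, min(float(max_digits), -math.log10(err)))


def two_pass_stdev(values):
    """Sample standard deviation via the two-pass algorithm."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_dev / (n - 1))


# Illustrative data: large offset, small spread. The exact sample
# standard deviation is 1, which plays the role of the certified value.
data = [10000001.0, 10000003.0, 10000002.0]
certified_sd = 1.0

lre = log_relative_error(two_pass_stdev(data), certified_sd)
print(f"LRE of two-pass standard deviation: {lre:.1f}")
```

The same function can score any routine from any package: feed it the value the package returns and the certified value for the dataset, then compare the resulting digit counts across packages.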

Keywords: Statistical software; numerical computation; software accuracy

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.bjps/1256562753
Digital Object Identifier: doi:10.1214/08-BJPS017

References

Altman, M. (2002). A review of JMP 4.03 with special attention to its numerical accuracy. American Statistician 56 72–75.
Mathematical Reviews (MathSciNet): MR1939397
Digital Object Identifier: doi:10.1198/000313002753631402
Bustos, O. H. and Frery, A. C. (2006). Statistical functions and procedures in IDL 5.6 and 6.0. Computational Statistics & Data Analysis 50 301–310.
Mathematical Reviews (MathSciNet): MR2201864
Coppin, P., Jonckheere, I., Nackaerts, K., Muys, B. and Lambin, E. (2004). Digital change detection methods in ecosystem monitoring: A review. International Journal of Remote Sensing 25 1565–1596.
Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition, 2nd ed. Academic Press, Boston, MA.
Mathematical Reviews (MathSciNet): MR1075415
Zentralblatt MATH: 0711.62052
Keeling, K. B. and Pavur, R. J. (2007). A comparative study of the reliability of nine statistical software packages. Computational Statistics & Data Analysis 51 3811–3831.
Mathematical Reviews (MathSciNet): MR2364493
Knüsel, L. (1989). Computation of statistical distributions. Available at http://www.stat.uni-muenchen.de/~knuesel/, last visited in April, 2008.
Knüsel, L. (1998). On the accuracy of statistical distributions in Microsoft Excel 97. Computational Statistics & Data Analysis 26 375–377.
Marsaglia, G. (1998). The diehard battery of tests of randomness. Available at http://www.stat.fsu.edu/pub/diehard, last visited in April, 2008.
Marsaglia, G. (2003). Random number generators. Journal of Modern Applied Statistical Methods 2 2–13.
Marsaglia, G. and Tsang, W. W. (2002). Some difficult-to-pass tests of randomness. Journal of Statistical Software 7 1–8.
Matsumoto, M. and Nishimura, T. (1998). Mersenne-Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation 8 3–30.
McCullough, B. D. (1998). Assessing the reliability of statistical software: Part I. American Statistician 52 358–366.
McCullough, B. D. (2000). The accuracy of Mathematica 4 as a statistical package. Computational Statistics 15(2) 279–299.
McCullough, B. D. and Heiser, D. A. (2008). On the accuracy of statistical procedures in Microsoft Excel 2007. Computational Statistics & Data Analysis 52 4570–4578.
McCullough, B. D. and Wilson, B. (1999). On the accuracy of statistical procedures in Microsoft Excel 97. Computational Statistics & Data Analysis 31 27–37.
McCullough, B. D. and Wilson, B. (2002). On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP. Computational Statistics & Data Analysis 40 713–721.
Mathematical Reviews (MathSciNet): MR1933481
McCullough, B. D. and Wilson, B. (2005). On the accuracy of statistical procedures in Microsoft Excel 2003. Computational Statistics & Data Analysis 49 1244–1252.
Mathematical Reviews (MathSciNet): MR2143068
NIST (2000). National Institute of Standards and Technology: The statistical reference datasets. Available at http://www.itl.nist.gov/div898/strd/, last visited in April, 2008.
Park, S. and Miller, K. (1988). Random number generators: Good ones are hard to find. Communications of the ACM 31 1192–1201.
Mathematical Reviews (MathSciNet): MR1022039
Digital Object Identifier: doi:10.1145/63039.63042
Wichmann, B. A. and Hill, I. D. (1982). Algorithm AS 183: An efficient and portable pseudo-random number generator. Applied Statistics 31 188–190.
Wichmann, B. A. and Hill, I. D. (1984). Correction: Algorithm AS 183: An efficient and portable pseudo-random number generator. Applied Statistics 33 123.
Yalta, A. T. (2007). The numerical reliability of GAUSS 8.0. American Statistician 61 262–268.
Yalta, A. T. (2008). The accuracy of statistical distributions in Microsoft® Excel 2007. Computational Statistics & Data Analysis 52 4579–4586.
Yalta, A. T. and Yalta, A. Y. (2007). GRETL 1.6.0 and its numerical accuracy. Journal of Applied Econometrics 22 849–854.
Mathematical Reviews (MathSciNet): MR2370977
Digital Object Identifier: doi:10.1002/jae.946

2009 © Brazilian Statistical Association