References
[1] Banks, D., L. House, P. Arabie, F.R. McMorris, and W. Gaul, eds. 2004. Classification, Clustering, and Data Mining Applications, Springer-Verlag, Berlin.
[2] Banks, D. 2007. Lectures on Statistical Data Mining, Duke University, Aug. 29–Nov. 28. http://www.stat.duke.edu/~banks/218-lectures.dir/
[3] Bauer, E. and R. Kohavi. 1999. ‘An Empirical Comparison of Voting Classification Algorithms,’ Machine Learning, 36, No. 1/2, 105–139.
[4] Buehlmann, P. and B. Yu. 2002. ‘Analyzing Bagging.’ The Annals of Statistics 30: 927–61.
[5] Berk, R. 2006. ‘An Introduction to Ensemble Methods for Data Analysis.’ Sociological Methods and Research, 34: 3, (February), 263–95.
[6] Berk, R., A. Li and L. Hickman. 2005. ‘Statistical Difficulties in Determining the Role of Race in Capital Cases’, Journal of Quantitative Criminology, 21: 4, 365–390.
[7] Biau, G., L. Devroye, and G. Lugosi. ‘Consistency of Random Forests and other averaging classifiers.’ Preprint, October 10, 2007.
[8] Breiman, L. and A. Cutler, RAF: http://www.math.usu.edu/~adele/forests/cc_graphics.htm
[9] Breiman, L., J.H. Friedman, R.A. Olshen, and C.J. Stone. 1984. Classification and Regression Trees. Monterey, CA: Wadsworth.
[10] Breiman, L., and P. Spector. 1992. ‘Submodel selection and evaluation in regression: The X-random case,’ International Statistical Review, 60: 291–319.
[11] Breiman, L. 1996a. ‘Bagging Predictors.’ Machine Learning 24: 123–40.
[12] Breiman, L. 1996b. ‘Out-of-Bag Estimation.’ ftp://ftp.stat.berkeley.edu/pub/users/breiman/OOBestimation.ps.
[13] Breiman, L. 1999. ‘Random Forests–Random Features.’ UC Berkeley, Statistics Department, Technical Report N. 567.
[14] Breiman, L. 2001a. ‘Random Forests.’ Machine Learning 45: 5–32.
[15] Breiman, L. 2001b. ‘Statistical Modeling: The Two Cultures’ (with discussion). Statistical Science 16: 199–231.
[16] Breiman, L. 2001c. ‘Wald Lecture I: Machine Learning’ and ‘Wald Lecture II: Looking Inside The Black Box’ ftp://ftp.stat.berkeley.edu/pub/users/breiman/.
[17] Breiman, L. 2004a. ‘Consistency For A Simple Model Of Random Forests,’ Technical Report 670, Statistics Department, University of California at Berkeley, September 9, 2004.
[18] Breiman, L. and A. Cutler. 2004. ‘Random Forests’ http://statwww.berkeley.edu/users/breiman/RandomForests/cc_home.htm.
[19] Breitenbach, M., R. Nielsen and G. Grudic. ‘Probabilistic Random Forests: Predicting Data Point Specific Misclassification Probabilities,’ Available at http://www.cs.colorado.edu/department/publications/reports/docs/CU-CS-954-03.pdf. MATLAB code available at: http://markus-breitenbach.com/machine_learning_code.php.
[20] Buehlmann, P. and Bin Yu. 2002. ‘Analyzing Bagging.’ The Annals of Statistics 30: 927–61.
[21] Bylander, T. 2002. ‘Estimating Generalization Error on Two-Class Datasets Using Out-of-Bag Estimates,’ Machine Learning 48, 1–3, p. 287–297.
[22] Chan, J.C-W. and D. Paelinckx. 2008. ‘Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery,’ Remote Sensing of Environment 112, 6, 16 June 2008, 2999–3011.
[23] Cochran, W.G., and D.B. Rubin. 1973. ‘Controlling bias in observational studies: A review,’ Sankhya: The Indian Journal of Statistics, Series A 35 (Part 4): 417–446.
[24] Cutler, A. and L. Breiman, RAFT: RAndom Forest Tool, Available at: http://www.stat.berkeley.edu/users/breiman/RandomForests/.
[25] Devroye, L., L. Gyorfi, and G. Lugosi. 1996. A Probabilistic Theory of Pattern Recognition. Springer-Verlag, New York.
[26] Diaz-Uriarte, R. 2007. ‘GeneSrF and varSelRF: a web-based tool and R package for gene selection and classification using random forest,’ BMC Bioinformatics, 8: 328.
[27] Dietterich, T. 1998. ‘An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization’, Machine Learning, 1–22.
[28] Dietterich, T. 2002. ‘Ensemble Learning,’ In The Handbook of Brain Theory and Neural Networks, Second edition, (M.A. Arbib, Ed.), Cambridge, MA: The MIT Press, 405–408.
[29] Dietterich, T. 2007. ‘Ensemble Methods in Machine Learning,’ Available at: eecs.oregonstate.edu/~tgd/publications/mcs-ensembles.ps.gz.
[30] Efron, B. 1979. ‘Bootstrap methods: another look at the jackknife,’ The Annals of Statistics 7: 1–26.
[31] Efron, B. and G. Gong. 1983. ‘A leisurely look at the bootstrap, the jackknife, and cross-validation,’ The American Statistician 37: 36–48.
[32] Freund, Y. and R. Schapire. 1996. ‘Experiments with a new boosting algorithm’, Machine Learning: Proceedings of the 13th International Conference, 148–156.
[33] Friedman, J.H., T. Hastie, and R. Tibshirani. 2000. ‘Additive Logistic Regression: A Statistical View of Boosting’ (with discussion). Annals of Statistics 28: 337–407.
[34] Friedman, J.H. 2001. ‘Greedy Function Approximation: A Gradient Boosting Machine.’ Annals of Statistics 29: 1189–1232.
[35] Friedman, J.H. 2002. ‘Stochastic Gradient Boosting.’ Computational Statistics and Data Analysis 38: 4, 367–78.
[36] Frölich, M. 2004. ‘Finite sample properties of propensity score matching and weighting estimators,’ Review of Economics and Statistics 86: 77–90.
[37] Grandvalet, Y. 2004. ‘Bagging Equalizes Influence.’ Machine Learning 55: 251–70.
[38] Hastie, T., R. Tibshirani, and J. Friedman. 2001[2009]. The Elements of Statistical Learning. New York: Springer-Verlag.
[39] Ho, D., K. Imai, G. King, and E. Stuart. 2007. ‘Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference,’ Political Analysis, 15: 199–236.
[40] Ho, T.K. 1995. ‘Random Decision Forests.’ Proceedings of the 3rd International Conf. on Document Analysis and Recognition, Montreal, Canada, August 14–18, 1995, 278–282.
[41] Hothorn, T. and B. Lausen. 2003. ‘Double-bagging: Combining classifiers by bootstrap aggregation,’ Pattern Recognition, 36: 6, 1303–1309.
[42] Hothorn, T., B. Lausen, A. Benner and M. Radespiel-Troeger. 2004. ‘Bagging Survival Trees.’ Statistics in Medicine, 23: 1, 77–91.
[43] Hothorn, T., P. Buehlmann, S. Dudoit, A. Molinaro and M.J. van der Laan. 2006. ‘Survival Ensembles.’ Biostatistics, 7: 3, 355–373.
[44] Hothorn, T. and A. Peters, 2009. ipred, http://cran.r-project.org/web/packages/ipred/index.html
[45] Ishwaran, H. and U. Kogalur. 2007. randomSurvivalForest (R software for random survival forest) Ensemble survival analysis based on a random forest of trees using random inputs. Version 3.0.1.
[46] Karpievitch, Y.V., A.P. Leclerc, E.G. Hill, J.S. Almeida, ‘RF++: Improved Random Forest for Clustered Data Classification,’ http://www.ohloh.net/p/rfpp
[47] Kumar, Manish and M. Thenmozhi, ‘Forecasting Stock Index Movement: A Comparison of Support Vector Machines and Random Forest,’ Indian Institute of Capital Markets 9th Capital Markets Conference Paper Available at SSRN: http://ssrn.com/abstract=876544.
[48] LeBlanc, M. and R. Tibshirani. 1996. ‘Combining Estimates in Regression and Classification.’ Journal of the American Statistical Association 91: 1641–50.
[49] Leshem, G. 2005. ‘Improvement of Adaboost Algorithm by using Random Forests as Weak Learner.’ Ph.D. Thesis, Hebrew University of Jerusalem: shum.huji.ac.il/~gleshem/Guy_Leshem_Proposal.pdf
[50] Liaw, A. and M. Wiener. 2002. ‘Classification and Regression by randomForest,’ R News, Vol. 2/3, p. 18. (Discussion of the use of the random forest package for R.)
[51] Liaw, A. and M. Wiener. 2007. randomForest (R software for random forest). Fortran original (L. Breiman and A. Cutler), R port (A. Liaw and M. Wiener). Versions 4.5-19 and 4.5-25. http://cran.r-project.org/web/packages/randomForest/index.html
[52] Lin, Y. and Y. Jeon. 2006. ‘Random Forests and adaptive nearest neighbors,’ Journal of the American Statistical Association, 101 (474): 578–590.
[53] Loh, W.-Y. 2002. ‘Regression Trees With Unbiased Variable Selection and Interaction Detection.’ Statistica Sinica 12: 361–86.
[54] Mannor, S., R. Meir and T. Zhang. 2002. ‘The Consistency of Greedy Algorithms for Classification,’ COLT, 319–333.
[55] Meinshausen, N. 2006. ‘Quantile regression forests,’ Journal of Machine Learning Research, 7: 983–999.
[56] Nguyen, T.T. 2008. ‘Outlier and Exception Analysis in Rough Sets and Granular Computing,’ in Handbook of Granular Computing (Eds. W. Pedrycz, A. Skowron, V. Kreinovich), Wiley.
[57] Opitz, D. and R. Maclin. 1999. ‘Popular Ensemble Methods: An Empirical Study’, Journal of Artificial Intelligence Research, 11, 169–198, citeseer.ist.psu.edu/opitz99popular.html.
[58] Peters, A. and T. Hothorn. 2007. ipred: Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error. (R software for random forest prediction). Version: 0.8-5
[59] Picard, R. and D. Cook. 1984. ‘Cross-Validation of Regression Models,’ Journal of the American Statistical Association 79 (387): 575–583.
[60] Quinlan, R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.
[61] Rosenbaum, P.R. 1984. ‘The consequences of adjusting for a concomitant variable that has been affected by the treatment,’ Journal of the Royal Statistical Society, Series A 147: 656–66.
[62] Rosenbaum, P.R. 1989. ‘Optimal matching for observational studies,’ Journal of the American Statistical Association 84: 1024–1032.
[63] Rosenbaum, P.R. 2002. Observational studies. 2nd ed. New York: Springer.
[64] Rosenbaum, P.R., and D.B. Rubin. 1983. ‘The central role of the propensity score in observational studies for causal effects,’ Biometrika 70: 41–55.
[65] Sandri, M. and P. Zuccolotto. 2009. ‘Variable selection using Random Forests,’ Typescript, 8 pages.
[66] Schapire, R.E. 1990. ‘The strength of weak learnability,’ Machine Learning, 5: 197–227.
[67] Schapire, R. E. 1999. ‘A Brief Introduction to Boosting.’ In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence.
[68] Schapire, R.E., Y. Freund, P. Bartlett, and W.S. Lee. 1998. ‘Boosting the margin: A new explanation for the effectiveness of voting methods,’ The Annals of Statistics, 26: 1651–1686.
[69] Shannon, W., and D. Banks. 1997. ‘An MLE Strategy for Combining CART Models,’ Computing Science and Statistics, 29: 540–544.
[70] Shi, T., D. Seligson, A.S. Belldegrun, A. Palotie, and S. Horvath. 2005. ‘Tumor classification by tissue microarray profiling: random forest clustering applied to renal cell carcinoma,’ Modern Pathology 18: 4, 547–57.
[71] Siroky, D.S. 2009. Secession and Survival, Ph.D. Dissertation, Duke University.
[72] Strobl, C., A-L Boulesteix, A. Zeileis and T. Hothorn. 2007. ‘Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution,’ BMC Bioinformatics, 8: 25. http://www.biomedcentral.com/1471-2105/8/25/abstract.
[73] Strobl, C. and A. Zeileis. 2008. ‘Danger: High Power! – Exploring the Statistical Properties of a Test for Random Forest Variable Importance,’ Technical Report Number 017, Department of Statistics, University of Munich.
[74] Strobl, C., A-L Boulesteix, T. Augustin and A. Zeileis. 2008. ‘Conditional variable importance for Random Forests,’ BMC Bioinformatics, 9: 307.
[75] Stone, C. 1977. ‘Consistent nonparametric regression,’ The Annals of Statistics, 5: 595–645.
[76] Su, X., M. Wang, and J. Fan. 2004. ‘Maximum Likelihood Regression Trees.’ Journal of Computational and Graphical Statistics 13: 586–98.
[77] Therneau, T.M. and B. Atkinson. ‘rpart: Recursive Partitioning.’ Recursive partitioning and regression trees, Version 3.1-38 (CART for R).
[78] Traskin, M. ‘Random Forests: classification, variable selection and consistency,’ STAT900 Slides, University of Pennsylvania, Nov. 26, 2007.
[79] Wang, T. MATLAB R13. Available at: http://lib.stat.cmu.edu/matlab/
[80] Ward, M., S. Pajevic, J. Dreyfuss, and J. Malley. 2006. ‘Short-term prediction of mortality in patients with systemic lupus erythematosus: classification of outcomes using Random Forests,’ Arthritis and Rheumatism 55: 74–80.