Electronic Journal of Statistics

Partial information framework: Model-based aggregation of estimates from diverse information sources

Ville A. Satopää, Shane T. Jensen, Robin Pemantle, and Lyle H. Ungar

Full-text: Open access

Abstract

Prediction polling is an increasingly popular form of crowdsourcing in which multiple participants estimate the probability or magnitude of some future event. These estimates are then aggregated into a single forecast. Historically, randomness in scientific estimation has been generally assumed to arise from unmeasured factors which are viewed as measurement noise. However, when combining subjective estimates, heterogeneity stemming from differences in the participants’ information is often more important than measurement noise. This paper formalizes information diversity as an alternative source of such heterogeneity and introduces a novel modeling framework that is particularly well-suited for prediction polls. A practical specification of this framework is proposed and applied to the task of aggregating probability and point estimates from two real-world prediction polls. In both cases our model outperforms standard measurement-error-based aggregators, hence providing evidence in favor of information diversity being the more important source of heterogeneity.

Article information

Source
Electron. J. Statist., Volume 11, Number 2 (2017), 3781-3814.

Dates
Received: September 2016
First available in Project Euclid: 18 October 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1508292526

Digital Object Identifier
doi:10.1214/17-EJS1346

Mathematical Reviews number (MathSciNet)
MR3714298

Zentralblatt MATH identifier
06796555

Subjects
Primary: 62A01: Foundations and philosophical topics; 62B99: None of the above, but in this section
Secondary: 62P15: Applications to psychology

Keywords
Expert belief; forecast heterogeneity; judgmental forecasting; model averaging; noise reduction; unsupervised learning

Rights
Creative Commons Attribution 4.0 International License.

Citation

Satopää, Ville A.; Jensen, Shane T.; Pemantle, Robin; Ungar, Lyle H. Partial information framework: Model-based aggregation of estimates from diverse information sources. Electron. J. Statist. 11 (2017), no. 2, 3781--3814. doi:10.1214/17-EJS1346. https://projecteuclid.org/euclid.ejs/1508292526


