Statistical Science

In Search of the Black Swan: Analysis of the Statistical Evidence of Electoral Fraud in Venezuela

Ricardo Hausmann and Roberto Rigobon



This study analyzes diverse hypotheses of electronic fraud in the Recall Referendum held in Venezuela on August 15, 2004. We define fraud as the difference between the elector’s intent and the official vote tally. Our null hypothesis is that there was no fraud, and we search for evidence that would allow us to reject it. We find no evidence that fraud was committed by applying numerical maximums to machines in some precincts. Likewise, we discard any hypothesis that implies altering some machines and not others within an electoral precinct, because the variation patterns between machines at each precinct are normal. However, the statistical evidence is compatible with fraud that affected every machine in a given precinct, but affected some precincts more than others. We find that the deviation pattern between precincts in the relationship between the signatures collected in November 2003 to request the referendum (the so-called Reafirmazo) and the YES votes on August 15 is positively and significantly correlated with the deviation pattern in the relationship between exit polls and votes in those same precincts. In other words, the precincts in which, according to the number of signatures, there is an unusually low number of YES votes (i.e., votes to impeach the president) are also the precincts in which, according to the exit polls, the same thing occurs. Using statistical techniques, we rule out the possibility that this is due to spurious errors in the data or to random coefficients in such relationships. We interpret the correlation as arising because both the signatures and the exit polls are imperfect measurements of the elector’s intent but not of the possible fraud, so that what causes their correlation is precisely the presence of fraud. Moreover, we find that the sample used in the audit conducted on August 18 was neither random nor representative of the entire universe of precincts. In this sample, the Reafirmazo signatures are associated with 10 percent more votes than in the non-audited precincts. We built 1,000 random samples of non-audited precincts and found that this result occurs with a frequency lower than 1 percent. This result is compatible with the hypothesis that the sample for the audit was chosen only among those precincts whose results had not been altered.
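
The resampling check described at the end of the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the statistic used here (total YES votes divided by total signatures), the function names, and the synthetic precinct data are all assumptions made for illustration. The idea is to draw many random samples of non-audited precincts, each the size of the audited sample, and count how often a random sample shows at least as many votes per signature as the audited one.

```python
import random


def votes_per_signature(precincts):
    """Total YES votes divided by total signatures.

    `precincts` is a list of (signatures, yes_votes) pairs. This ratio is an
    assumed stand-in for the paper's signature-to-vote relationship.
    """
    total_signatures = sum(s for s, _ in precincts)
    total_yes = sum(y for _, y in precincts)
    return total_yes / total_signatures


def resampling_frequency(audited, non_audited, n_draws=1000, seed=0):
    """Share of random non-audited samples whose ratio is at least the audited ratio."""
    rng = random.Random(seed)
    observed = votes_per_signature(audited)
    hits = 0
    for _ in range(n_draws):
        # Draw, without replacement, a sample the same size as the audit.
        sample = rng.sample(non_audited, len(audited))
        if votes_per_signature(sample) >= observed:
            hits += 1
    return hits / n_draws


# Synthetic data only (hypothetical numbers, not the referendum data).
data_rng = random.Random(42)


def synthetic_precinct(votes_per_sig):
    """One made-up precinct with noisy YES votes around votes_per_sig * signatures."""
    signatures = data_rng.randint(100, 1000)
    yes_votes = signatures * votes_per_sig * data_rng.gauss(1.0, 0.1)
    return signatures, yes_votes


non_audited = [synthetic_precinct(1.0) for _ in range(2000)]
# Audited precincts are built with 10 percent more votes per signature.
audited = [synthetic_precinct(1.1) for _ in range(150)]

frequency = resampling_frequency(audited, non_audited)
print(f"share of random samples matching the audited ratio: {frequency:.3f}")
```

A small frequency means that a sample resembling the audited one would rarely arise by drawing at random from the non-audited precincts, which is the sense in which the abstract calls the audit sample non-random.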

Article information

Statist. Sci. Volume 26, Number 4 (2011), 543-563.

First available in Project Euclid: 28 February 2012

Digital Object Identifier

doi:10.1214/11-STS373
Keywords

Electronic voting, instrumental variables, identification


Hausmann, Ricardo; Rigobon, Roberto. In Search of the Black Swan: Analysis of the Statistical Evidence of Electoral Fraud in Venezuela. Statist. Sci. 26 (2011), no. 4, 543-563. doi:10.1214/11-STS373.


