The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 7, Number 3 (2013), 1612-1639.
Influencing elections with statistics: Targeting voters with logistic regression trees
In political campaigning substantial resources are spent on voter mobilization, that is, on identifying and influencing as many people as possible to vote. Campaigns use statistical tools for deciding whom to target (“microtargeting”). In this paper we describe a nonpartisan campaign that aims at increasing overall turnout using the example of the 2004 US presidential election. Based on a real data set of 19,634 eligible voters from Ohio, we introduce a modern statistical framework well suited for carrying out the main tasks of voter targeting in a single sweep: predicting an individual’s turnout (or support) likelihood for a particular cause, party or candidate as well as data-driven voter segmentation. Our framework, which we refer to as LORET (for LOgistic REgression Trees), contains standard methods such as logistic regression and classification trees as special cases and allows for a synthesis of both techniques. For our case study, we explore various LORET models with different regressors in the logistic model components and different partitioning variables in the tree components; we analyze them in terms of their predictive accuracy and compare the effect of using the full set of available variables against using only a limited amount of information. We find that augmenting a standard set of variables (such as age and voting history) with additional predictor variables (such as the household composition in terms of party affiliation) clearly improves predictive accuracy. We also find that LORET models based on tree induction beat the unpartitioned models. Furthermore, we illustrate how voter segmentation arises from our framework and discuss the resulting profiles from a targeting point of view.
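The structure described above, a partition of the voters with a separate logistic model in each segment, can be illustrated with a minimal sketch. It is not the authors' implementation: the split is fixed by hand on a single binary variable rather than learned by tree induction, the data are simulated, and every name (`segment`, `x`, `simulate`, `fit_logistic`) is hypothetical. It only shows how a segmented logistic model contains the unpartitioned logistic regression as a special case and can fit better when the effect of a regressor differs between segments.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, steps=1000, lr=0.5):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            r = y - sigmoid(b0 + b1 * x)  # residual on the probability scale
            g0 += r
            g1 += r * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def log_lik(model, xs, ys):
    """Bernoulli log-likelihood of a fitted (b0, b1) model."""
    b0, b1 = model
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(b0 + b1 * x), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def simulate(n, seed=0):
    """Synthetic voters: (segment, x, voted). Assumed process, not the Ohio data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.randint(0, 1)   # hypothetical partitioning variable
        x = rng.uniform(-1.0, 1.0)    # hypothetical scaled regressor
        # The slope of x differs by segment, which one global model cannot capture.
        z = (-0.5 + 2.5 * x) if segment == 0 else (1.0 - 2.5 * x)
        voted = 1 if rng.random() < sigmoid(z) else 0
        rows.append((segment, x, voted))
    return rows


rows = simulate(400)

# Unpartitioned model: one logistic regression for all voters.
all_x = [x for _, x, _ in rows]
all_y = [y for _, _, y in rows]
global_model = fit_logistic(all_x, all_y)
global_ll = log_lik(global_model, all_x, all_y)

# Segmented model: one logistic regression per segment (a one-split "tree").
segment_models = {}
segmented_ll = 0.0
for seg in (0, 1):
    seg_x = [x for s, x, _ in rows if s == seg]
    seg_y = [y for s, _, y in rows if s == seg]
    segment_models[seg] = fit_logistic(seg_x, seg_y)
    segmented_ll += log_lik(segment_models[seg], seg_x, seg_y)


def predict_turnout(segment, x):
    """Predicted turnout probability from the model of the voter's segment."""
    b0, b1 = segment_models[segment]
    return sigmoid(b0 + b1 * x)


print("global log-likelihood:   ", round(global_ll, 2))
print("segmented log-likelihood:", round(segmented_ll, 2))
```

The comparison here is in-sample and therefore favors the more flexible model by construction; an assessment of predictive accuracy, as in the paper, would require held-out data.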
First available in Project Euclid: 3 October 2013
Rusch, Thomas; Lee, Ilro; Hornik, Kurt; Jank, Wolfgang; Zeileis, Achim. Influencing elections with statistics: Targeting voters with logistic regression trees. Ann. Appl. Stat. 7 (2013), no. 3, 1612-1639. doi:10.1214/13-AOAS648. https://projecteuclid.org/euclid.aoas/1380804809
- Supplementary material A: Data and Code. A bundle containing the code used to produce the results of the paper and a snapshot of the data set. Unfortunately we are not at liberty to share the whole original data set, but were allowed to include an anonymized, random sample ($N=6544$) of the data.
- Supplementary material B: Rejoinder. A rejoinder containing additional analyses of LORET models with a historic proxy variable and a comparison of LORET models to high-performance methods like Support Vector Machines, Bayesian Additive Regression Trees, Artificial Neural Networks, Logistic Model Trees and Random Forests.