Statistical Science

Network-Based Marketing: Identifying Likely Adopters via Consumer Networks

Shawndra Hill, Foster Provost, and Chris Volinsky



Network-based marketing refers to a collection of marketing techniques that take advantage of links between consumers to increase sales. We concentrate on the consumer networks formed using direct interactions (e.g., communications) between consumers. We survey the diverse literature on such marketing with an emphasis on the statistical methods used and the data to which these methods have been applied. We also provide a discussion of challenges and opportunities for this burgeoning research topic. Our survey highlights a gap in the literature. Because of inadequate data, prior studies have not been able to provide direct, statistical support for the hypothesis that network linkage can directly affect product/service adoption. Using a new data set that represents the adoption of a new telecommunications service, we show very strong support for the hypothesis. Specifically, we show three main results: (1) “Network neighbors”—those consumers linked to a prior customer—adopt the service at a rate 3–5 times greater than baseline groups selected by the best practices of the firm’s marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More detailed network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption.
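As an illustration of result (1), the following is a minimal Python sketch of how a "network neighbor" flag and the resulting adoption-rate ratio might be computed. It is not the authors' code or data: the function names (`network_neighbors`, `adoption_rate`), the toy link list, and the customer identifiers are all hypothetical. As a simplification, the baseline here is just the set of targets with no link to a prior customer, whereas the abstract compares against groups chosen by the firm's marketing team.

```python
from typing import Iterable, Set, Tuple


def network_neighbors(links: Iterable[Tuple[str, str]],
                      prior_customers: Set[str]) -> Set[str]:
    """Return consumers who are not prior customers but are directly
    linked (e.g., by a communication) to at least one prior customer."""
    neighbors: Set[str] = set()
    for a, b in links:
        if a in prior_customers and b not in prior_customers:
            neighbors.add(b)
        if b in prior_customers and a not in prior_customers:
            neighbors.add(a)
    return neighbors


def adoption_rate(group: Set[str], adopted: Set[str]) -> float:
    """Fraction of the group that adopted; 0.0 for an empty group."""
    if not group:
        return 0.0
    return len(group & adopted) / len(group)


# Hypothetical toy data: undirected communication links between consumers.
links = [("c1", "p1"), ("c1", "p2"), ("c2", "p3"), ("p4", "p5")]
prior_customers = {"c1", "c2"}
targets = {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"}
adopted = {"p1", "p3", "p6"}

# Targets linked to a prior customer versus all remaining targets.
nn_group = network_neighbors(links, prior_customers) & targets
baseline_group = targets - nn_group

nn_rate = adoption_rate(nn_group, adopted)
baseline_rate = adoption_rate(baseline_group, adopted)

# Ratio of the two adoption rates (assumes the baseline rate is nonzero).
lift = nn_rate / baseline_rate
print(f"network-neighbor rate: {nn_rate:.3f}")
print(f"baseline rate:         {baseline_rate:.3f}")
print(f"ratio:                 {lift:.2f}")
```

The ratio printed at the end is the same kind of quantity as the "3–5 times greater" figure in the abstract, but the numbers produced here come from invented data and carry no empirical meaning.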

Article information

Statist. Sci. Volume 21, Number 2 (2006), 256-276.

First available in Project Euclid: 7 August 2006



Hill, Shawndra; Provost, Foster; Volinsky, Chris. Network-Based Marketing: Identifying Likely Adopters via Consumer Networks. Statist. Sci. 21 (2006), no. 2, 256--276. doi:10.1214/088342306000000222.


