This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into the new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations on two fronts. In statistical methodology, innovation is driven by the challenges and complications that arise in eCommerce data. In online research, it comes from being able to ask (and subsequently answer) new research questions that classical statistical methods cannot address, and from expanding research questions beyond those traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges together with some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics.