Statistical Science

Functional Data Analysis in Electronic Commerce Research

Wolfgang Jank and Galit Shmueli
Source: Statist. Sci. Volume 21, Number 2 (2006), 155-166.

Abstract

This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask (and subsequently answer) new research questions that classical statistical methods are not able to address, and also by expanding on research questions beyond the ones traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges accompanied by some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics.

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1154979818
Digital Object Identifier: doi:10.1214/088342306000000132
Mathematical Reviews number (MathSciNet): MR2324075
Zentralblatt MATH identifier: 05191857


2012 © Institute of Mathematical Statistics
