The Annals of Statistics

Empirical dynamics for longitudinal data

Hans-Georg Müller and Fang Yao


Abstract

We demonstrate that the processes underlying on-line auction price bids and many other longitudinal data can be represented by an empirical first order stochastic ordinary differential equation with time-varying coefficients and a smooth drift process. This equation may be empirically obtained from longitudinal observations for a sample of subjects and does not presuppose specific knowledge of the underlying processes. For the nonparametric estimation of the components of the differential equation, it suffices to have sparsely observed longitudinal measurements, which may be noisy and are generated by underlying smooth random trajectories for each subject or experimental unit in the sample. The drift process that drives the equation determines how closely individual process trajectories follow a deterministic approximation of the differential equation. We provide estimates for trajectories and especially the variance function of the drift process. At each fixed time point, the proposed empirical dynamic model implies a decomposition of the derivative of the process underlying the longitudinal data into a linear component, determined by a varying coefficient function dynamic equation, and an orthogonal complement that corresponds to the drift process. An enhanced perturbation result enables us to obtain improved asymptotic convergence rates for eigenfunction derivative estimation and consistency for the varying coefficient function and the components of the drift process. We illustrate the differential equation with an application to the dynamics of on-line auction data.
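
The decomposition at a fixed time point described in the abstract can be sketched in symbols as follows. The notation is an assumption made here for illustration and does not appear in the abstract: X denotes the smooth underlying process, μ its mean function, β the varying coefficient function and Z the drift process. The expression given for β(t) is the usual least-squares coefficient, which is the choice that leaves the remainder Z(t) uncorrelated with X(t); for jointly Gaussian X(t) and X'(t) it coincides with the conditional-expectation coefficient.

```latex
% Illustrative notation (assumed, not from the abstract):
%   X      smooth underlying process,   \mu   its mean function,
%   \beta  varying coefficient function, Z    drift process.
X'(t) - \mu'(t) = \beta(t)\,\{X(t) - \mu(t)\} + Z(t), \qquad
\beta(t) = \frac{\operatorname{cov}\{X'(t),\, X(t)\}}{\operatorname{var}\{X(t)\}}, \qquad
\operatorname{cov}\{Z(t),\, X(t)\} = 0 .
```

Read this way, the variance of Z(t) relative to the variance of X'(t) indicates how closely trajectories follow the deterministic part of the equation at time t.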

Article information

Source
Ann. Statist., Volume 38, Number 6 (2010), 3458-3486.

Dates
First available in Project Euclid: 30 November 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1291126964

Digital Object Identifier
doi:10.1214/09-AOS786

Mathematical Reviews number (MathSciNet)
MR2766859

Zentralblatt MATH identifier
1233.62069

Subjects
Primary: 62G05: Estimation; 62G20: Asymptotic properties

Keywords
Functional data analysis; longitudinal data; stochastic differential equation; Gaussian process

Citation

Müller, Hans-Georg; Yao, Fang. Empirical dynamics for longitudinal data. Ann. Statist. 38 (2010), no. 6, 3458–3486. doi:10.1214/09-AOS786. https://projecteuclid.org/euclid.aos/1291126964


