Electronic Journal of Statistics

Asymptotics for $p$-value based threshold estimation in regression settings

Atul Mallik, Moulinath Banerjee, and Bodhisattva Sen

Abstract

We investigate the large sample behavior of a $p$-value based procedure for estimating the threshold level at which a regression function takes off from its baseline value – a problem that frequently arises in environmental statistics, engineering and other related fields. The estimate is constructed by fitting a “stump” function to approximate $p$-values obtained from tests for deviation of the regression function from its baseline level. The smoothness of the regression function in the vicinity of the threshold determines the rate of convergence: a “cusp” of order $k$ at the threshold yields an optimal convergence rate of $n^{-1/(2k+1)}$, $n$ being the number of sampled covariates. We show that the asymptotic distribution of the normalized estimate of the threshold, for both i.i.d. and short-range dependent errors, is the minimizer of an integrated and transformed Gaussian process. We study the finite sample behavior of confidence intervals obtained through the asymptotic approximation using simulations, consider extensions to short-range dependent data, and apply our inference procedure to two real data sets.
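The stump-fitting step described in the abstract can be illustrated with a small sketch. The abstract does not give the construction in detail, so everything specific here is an assumption made for illustration: a known baseline and known error standard deviation, a Nadaraya-Watson smoother with a Gaussian kernel, one-sided normal $p$-values, and a stump taking the value 1/2 to the left of the candidate threshold and 0 to the right. The function names `normal_sf` and `estimate_threshold` are hypothetical and are not taken from the paper.

```python
import math
import numpy as np

# Vectorized complementary error function from the standard library.
_erfc = np.vectorize(math.erfc, otypes=[float])


def normal_sf(z):
    """Standard normal survival function 1 - Phi(z)."""
    return 0.5 * _erfc(np.asarray(z, dtype=float) / math.sqrt(2.0))


def estimate_threshold(x, y, baseline, sigma, bandwidth):
    """Sketch of p-value based threshold estimation (illustrative assumptions).

    Assumes a known baseline value, a known error standard deviation sigma,
    and i.i.d. errors. Returns (d_hat, sorted x, approximate p-values).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Nadaraya-Watson weights with a Gaussian kernel (an assumed smoother).
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    mu_hat = w @ y

    # Standard error of each smoothed value under i.i.d. errors.
    se = sigma * np.sqrt((w ** 2).sum(axis=1))

    # One-sided p-values for H0: regression function equals the baseline.
    pvals = normal_sf((mu_hat - baseline) / se)

    # Least squares stump fit: level 1/2 for x <= d, level 0 for x > d.
    left = np.cumsum((pvals - 0.5) ** 2)        # cost on points 0..k
    tail = np.cumsum((pvals ** 2)[::-1])[::-1]  # tail[k] = sum of p_i^2, i >= k
    right = np.append(tail[1:], 0.0)            # cost on points k+1..n-1
    k = int(np.argmin(left + right))
    return x[k], x, pvals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    x = rng.uniform(0.0, 1.0, n)
    # Flat at the baseline 0 up to 0.5, then a linear take-off.
    y = 4.0 * np.clip(x - 0.5, 0.0, None) + rng.normal(0.0, 0.1, n)
    d_hat, _, _ = estimate_threshold(x, y, baseline=0.0, sigma=0.1, bandwidth=0.005)
    print(f"estimated threshold: {d_hat:.3f} (true value 0.5)")
```

Under these assumptions the $p$-values are roughly uniform (mean 1/2) where the regression function sits at its baseline and close to 0 beyond the threshold, which is why a two-level stump is a natural fitting target. In practice the baseline and error variance would be estimated rather than supplied, and the bandwidth would be chosen with the cusp order in mind.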

Article information

Source
Electron. J. Statist. Volume 7 (2013), 2477-2515.

Dates
First available in Project Euclid: 8 October 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1381239959

Digital Object Identifier
doi:10.1214/13-EJS845

Mathematical Reviews number (MathSciNet)
MR3117104

Zentralblatt MATH identifier
1294.62106

Subjects
Primary: 62G20: Asymptotic properties; 62G86: Nonparametric inference and fuzziness
Secondary: 62G30: Order statistics; empirical distribution functions

Keywords
Baseline value; change-point; integral of a transformed Gaussian process; least squares; nonparametric estimation; stump function

Citation

Mallik, Atul; Banerjee, Moulinath; Sen, Bodhisattva. Asymptotics for $p$-value based threshold estimation in regression settings. Electron. J. Statist. 7 (2013), 2477–2515. doi:10.1214/13-EJS845. https://projecteuclid.org/euclid.ejs/1381239959.

