Annals of Statistics

The two-sample problem for Poisson processes: Adaptive tests with a nonasymptotic wild bootstrap approach

Magalie Fromont, Béatrice Laurent, and Patricia Reynaud-Bouret

Full-text: Open access

Abstract

Considering two independent Poisson processes, we address the question of testing equality of their respective intensities. We first propose testing procedures whose test statistics are $U$-statistics based on single kernel functions. The corresponding critical values are constructed from a nonasymptotic wild bootstrap approach, leading to level $\alpha$ tests. Various choices for the kernel functions are possible, including projection, approximation or reproducing kernels. In this last case, we obtain a parametric rate of testing for a weak metric defined in the reproducing kernel Hilbert space (RKHS) associated with the considered reproducing kernel. Then we introduce, in the other cases, aggregated or multiple kernel testing procedures, which allow us to import ideas from adaptive estimation by model selection, thresholding and/or approximation kernels. These multiple kernel tests are proved to be of level $\alpha$, and to satisfy nonasymptotic oracle-type conditions for the classical $\mathbb{L}_{2}$-norm. From these conditions, we deduce that they are adaptive in the minimax sense over a large variety of classes of alternatives based on classical and weak Besov bodies in the univariate case, but also Sobolev and anisotropic Nikol’skii–Besov balls in the multivariate case.
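As an illustration only, a single-kernel test of this kind can be sketched in a few lines of Python. The sketch assumes one particular form of the statistic, which the abstract does not spell out: the points of the two processes are pooled, each point is marked +1 or -1 according to its process of origin, and the statistic sums $K(x_i, x_j) m_i m_j$ over pairs $i \neq j$. The wild bootstrap step is taken here to mean redrawing the marks as independent Rademacher signs, with the critical value approximated by Monte Carlo. The Gaussian kernel, its bandwidth, the observation window $[0, 1]$ and all function names are hypothetical choices made for this example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=0.1):
    """Gaussian kernel on the real line (illustrative choice, not from the paper)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def signed_u_statistic(gram, signs):
    """Sum over i != j of gram[i][j] * signs[i] * signs[j]."""
    n = len(signs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += gram[i][j] * signs[i] * signs[j]
    return total


def wild_bootstrap_test(points_a, points_b, alpha=0.05, n_boot=500,
                        kernel=gaussian_kernel, rng=None):
    """Return (observed statistic, bootstrap critical value, reject flag)."""
    if rng is None:
        rng = random.Random(0)
    pooled = list(points_a) + list(points_b)
    # Marks record which process each pooled point came from.
    marks = [1] * len(points_a) + [-1] * len(points_b)
    gram = [[kernel(x, y) for y in pooled] for x in pooled]
    observed = signed_u_statistic(gram, marks)
    # Assumed wild bootstrap: replace the marks by i.i.d. Rademacher signs,
    # keeping the pooled points (and hence the Gram matrix) fixed.
    boot = sorted(
        signed_u_statistic(gram, [rng.choice((-1, 1)) for _ in pooled])
        for _ in range(n_boot)
    )
    # Monte Carlo approximation of the conditional (1 - alpha) quantile.
    index = min(n_boot - 1, max(0, math.ceil((1 - alpha) * n_boot) - 1))
    critical = boot[index]
    return observed, critical, observed > critical


def simulate_homogeneous_poisson(rate, rng):
    """Points of a homogeneous Poisson process on [0, 1) via exponential gaps."""
    points = []
    t = rng.expovariate(rate)
    while t < 1.0:
        points.append(t)
        t += rng.expovariate(rate)
    return points


if __name__ == "__main__":
    rng = random.Random(42)
    first = simulate_homogeneous_poisson(40.0, rng)
    # Squaring the points of a homogeneous process gives a non-constant intensity.
    second = [t ** 2 for t in simulate_homogeneous_poisson(40.0, rng)]
    print(wild_bootstrap_test(first, second, rng=rng))
```

Because the pooled points are held fixed and only the signs are redrawn, the Gram matrix is computed once and reused for every bootstrap replicate. The aggregated procedures described in the abstract would combine several such kernels, which this sketch does not attempt.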

Article information

Source
Ann. Statist., Volume 41, Number 3 (2013), 1431–1461.

Dates
First available in Project Euclid: 1 August 2013

Permanent link to this document
https://projecteuclid.org/euclid.aos/1375362555

Digital Object Identifier
doi:10.1214/13-AOS1114

Mathematical Reviews number (MathSciNet)
MR3113817

Zentralblatt MATH identifier
1273.62102

Subjects
Primary: 62G09: Resampling methods; 62G10: Hypothesis testing; 62G55
Secondary: 62G20: Asymptotic properties

Keywords
Two-sample problem; Poisson process; bootstrap; adaptive tests; minimax separation rates; kernel methods; aggregation methods; multiple kernel

Citation

Fromont, Magalie; Laurent, Béatrice; Reynaud-Bouret, Patricia. The two-sample problem for Poisson processes: Adaptive tests with a nonasymptotic wild bootstrap approach. Ann. Statist. 41 (2013), no. 3, 1431–1461. doi:10.1214/13-AOS1114. https://projecteuclid.org/euclid.aos/1375362555


References

  • [1] Arcones, M. A. and Giné, E. (1992). On the bootstrap of $U$ and $V$ statistics. Ann. Statist. 20 655–674.
  • [2] Bach, F. R. (2008). Consistency of the group lasso and multiple kernel learning. J. Mach. Learn. Res. 9 1179–1225.
  • [3] Baraud, Y. (2002). Non-asymptotic minimax rates of testing in signal detection. Bernoulli 8 577–606.
  • [4] Baraud, Y., Huet, S. and Laurent, B. (2003). Adaptive tests of linear hypotheses by model selection. Ann. Statist. 31 225–251.
  • [5] Bartlett, P. L., Boucheron, S. and Lugosi, G. (2002). Model selection and error estimation. Machine Learning 48 85–113.
  • [6] Bovett, J. M. and Saw, J. G. (1980). On comparing two Poisson intensity functions. Comm. Statist. Theory Methods 9 943–948.
  • [7] Bretagnolle, J. (1983). Lois limites du bootstrap de certaines fonctionnelles. Ann. Inst. H. Poincaré Sect. B (N.S.) 19 281–296.
  • [8] Butucea, C. and Tribouley, K. (2006). Nonparametric homogeneity tests. J. Statist. Plann. Inference 136 597–639.
  • [9] Chapelle, O., Vapnik, V., Bousquet, O. and Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning 46 131–159.
  • [10] Chiu, S. N. (2010). Parametric bootstrap and approximate tests for two Poisson variates. J. Stat. Comput. Simul. 80 263–271.
  • [11] Chiu, S. N. and Wang, L. (2009). Homogeneity tests for several Poisson populations. Comput. Statist. Data Anal. 53 4266–4278.
  • [12] Cox, D. R. (1953). Some simple approximate tests for Poisson variates. Biometrika 40 354–360.
  • [13] Daley, D. J. and Vere-Jones, D. (2008). An Introduction to the Theory of Point Processes. Vol. II: General Theory and Structure, 2nd ed. Springer, New York.
  • [14] de la Peña, V. H. and Giné, E. (1999). Decoupling: From Dependence to Independence. Springer, New York.
  • [15] Dehling, H. and Mikosch, T. (1994). Random quadratic forms and the bootstrap for $U$-statistics. J. Multivariate Anal. 51 392–413.
  • [16] Deshpande, J. V., Mukhopadhyay, M. and Naik-Nimbalkar, U. V. (1999). Testing of two sample proportional intensity assumption for non-homogeneous Poisson processes. J. Statist. Plann. Inference 81 237–251.
  • [17] Deshpande, J. V. and Sengupta, D. (1995). Testing the hypothesis of proportional hazards in two populations. Biometrika 82 252–261.
  • [18] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Statist. 7 1–26.
  • [19] Fisher, R. A. (1935). The Design of Experiments. Oliver & Boyd, Edinburgh.
  • [20] Fromont, M., Laurent, B. and Reynaud-Bouret, P. (2011). Adaptive tests of homogeneity for a Poisson process. Ann. Inst. Henri Poincaré Probab. Stat. 47 176–213.
  • [21] Fromont, M., Laurent, B. and Reynaud-Bouret, P. (2013). Supplement to “The two-sample problem for Poisson processes: Adaptive tests with a non-asymptotic wild bootstrap approach.” DOI:10.1214/13-AOS1114SUPP.
  • [22] Gail, M. (1974). Computations for designing comparative Poisson trials. Biometrics 30 231–237.
  • [23] Giné M., E. (1975). Invariant tests for uniformity on compact Riemannian manifolds based on Sobolev norms. Ann. Statist. 3 1243–1266.
  • [24] Goldenshluger, A. and Lepski, O. (2011). Bandwidth selection in kernel density estimation: Oracle inequalities and adaptive minimax optimality. Ann. Statist. 39 1608–1632.
  • [25] Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B. and Smola, A. (2008). A kernel method for the two-sample problem. J. Mach. Learn. Res. 1 1–10.
  • [26] Gretton, A., Sriperumbudur, B. K., Sejdinovic, D., Strathmann, H., Balakrishnan, S., Pontil, M. and Fukumizu, K. (2012). Optimal kernel choice for large-scale two-sample tests. In Advances in Neural Information Processing Systems (NIPS) 25 1214–1222. Available at http://books.nips.cc/papers/files/nips25/NIPS2012_0592.pdf.
  • [27] Hoeffding, W. (1952). The large-sample power of tests based on permutations of observations. Ann. Math. Statistics 23 169–192.
  • [28] Horowitz, J. L. and Spokoiny, V. G. (2001). An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 69 599–631.
  • [29] Hušková, M. and Janssen, P. (1993). Consistency of the generalized bootstrap for degenerate $U$-statistics. Ann. Statist. 21 1811–1823.
  • [30] Ingster, Y. and Stepanova, N. (2011). Estimation and detection of functions from anisotropic Sobolev classes. Electron. J. Stat. 5 484–506.
  • [31] Ingster, Y. I. (1993). Asymptotically minimax hypothesis testing for nonparametric alternatives. I, II, III. Math. Methods Statist. 2 85–114, 171–189, 249–268.
  • [32] Ingster, Y. I. (2000). Adaptive chi-square tests. J. Math. Sci. 99 1110–1119.
  • [33] Ingster, Y. I. and Kutoyants, Y. A. (2007). Nonparametric hypothesis testing for intensity of the Poisson process. Math. Methods Statist. 16 217–245.
  • [34] Janssen, A. and Pauls, T. (2003). How do bootstrap and permutation tests work? Ann. Statist. 31 768–806.
  • [35] Janssen, P. (1994). Weighted bootstrapping of $U$-statistics. J. Statist. Plann. Inference 38 31–41.
  • [36] Koltchinskii, V. (2001). Rademacher penalties and structural risk minimization. IEEE Trans. Inform. Theory 47 1902–1914.
  • [37] Koltchinskii, V. (2006). Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist. 34 2593–2656.
  • [38] Koltchinskii, V. and Yuan, M. (2010). Sparsity in multiple kernel learning. Ann. Statist. 38 3660–3695.
  • [39] Krishnamoorthy, K. and Thomson, J. (2004). A more powerful test for comparing two Poisson means. J. Statist. Plann. Inference 119 23–35.
  • [40] Lanckriet, G. R. G., Cristianini, N., Bartlett, P., El Ghaoui, L. and Jordan, M. I. (2003/04). Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5 27–72.
  • [41] Latała, R. (1999). Tail and moment estimates for some types of chaos. Studia Math. 135 39–53.
  • [42] Lounici, K. and Nickl, R. (2011). Global uniform risk bounds for wavelet deconvolution estimators. Ann. Statist. 39 201–231.
  • [43] Mammen, E. (1992). Bootstrap, wild bootstrap, and asymptotic normality. Probab. Theory Related Fields 93 439–455.
  • [44] Micchelli, C. A. and Pontil, M. (2005). Learning the kernel function via regularization. J. Mach. Learn. Res. 6 1099–1125.
  • [45] Ng, H. K. T., Gu, K. and Tang, M. L. (2007). A comparative study of tests for the difference of two Poisson means. Comput. Statist. Data Anal. 51 3085–3099.
  • [46] Præstgaard, J. and Wellner, J. A. (1993). Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 21 2053–2086.
  • [47] Præstgaard, J. T. (1995). Permutation and bootstrap Kolmogorov–Smirnov tests for the equality of two distributions. Scand. J. Stat. 22 305–322.
  • [48] Przyborowski, J. and Wilenski, H. (1940). Homogeneity of results in testing samples from Poisson series with an application to testing clover seed for dodder. Biometrika 31 313–323.
  • [49] Rigollet, P. and Tsybakov, A. B. (2007). Linear and convex aggregation of density estimators. Math. Methods Statist. 16 260–280.
  • [50] Romano, J. P. (1988). A bootstrap revival of some nonparametric distance tests. J. Amer. Statist. Assoc. 83 698–708.
  • [51] Romano, J. P. (1989). Bootstrap and randomization tests of some nonparametric hypotheses. Ann. Statist. 17 141–159.
  • [52] Romano, J. P. and Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. J. Amer. Statist. Assoc. 100 94–108.
  • [53] Rubin, D. B. (1981). The Bayesian bootstrap. Ann. Statist. 9 130–134.
  • [54] Schölkopf, B. and Smola, A. J. (2002). Learning with Kernels. MIT Press, Cambridge, MA.
  • [55] Shiue, W. and Bain, L. J. (1982). Experiment size and power comparisons for two-sample Poisson tests. J. R. Stat. Soc. Ser. C Appl. Stat. 31 130–134.
  • [56] Spokoiny, V. G. (1996). Adaptive hypothesis testing using wavelets. Ann. Statist. 24 2477–2498.
  • [57] Spokoiny, V. G. (1998). Adaptive and spatially adaptive testing of a nonparametric hypothesis. Math. Methods Statist. 7 245–273.
  • [58] Sriperumbudur, B. K., Fukumizu, K., Gretton, A., Lanckriet, G. R. G. and Schölkopf, B. (2009). Kernel choice and classifiability for RKHS embeddings of probability distributions. In Advances in Neural Information Processing Systems (NIPS) 22 1750–1758. Available at http://books.nips.cc/papers/files/nips22/NIPS2009_0893.pdf.
  • [59] Sriperumbudur, B. K., Fukumizu, K. and Lanckriet, G. R. G. (2011). Universality, characteristic kernels and RKHS embedding of measures. J. Mach. Learn. Res. 12 2389–2410.
  • [60] Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer, New York.
  • [61] Wellner, J. A. (1979). Permutation tests for directional data. Ann. Statist. 7 929–943.

Supplemental materials

  • Supplementary material: Simulation study and additional proofs. A simulation study, the proofs of Proposition 1, Theorems 3 and 4, and of Corollaries 1, 2 and 3 are given in the supplementary material [21].