Statistical Science

Estimation in Discrete Parameter Models

Christine Choirat and Raffaello Seri

Abstract

In some estimation problems, especially in applications dealing with information theory, signal processing and biology, theory provides us with additional information allowing us to restrict the parameter space to a finite number of points. In this case, we speak of discrete parameter models. Even though the problem is quite old and has interesting connections with testing and model selection, asymptotic theory for these models has hardly ever been studied. Therefore, we discuss consistency, asymptotic distribution theory, information inequalities and their relations with efficiency and superefficiency for a general class of $m$-estimators.

Article information

Source
Statist. Sci., Volume 27, Number 2 (2012), 278-293.

Dates
First available in Project Euclid: 19 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ss/1340110873

Digital Object Identifier
doi:10.1214/11-STS371

Mathematical Reviews number (MathSciNet)
MR2963996

Zentralblatt MATH identifier
1330.62306

Keywords
Discrete parameter space; detection; large deviations; information inequalities; efficiency; superefficiency

Citation

Choirat, Christine; Seri, Raffaello. Estimation in Discrete Parameter Models. Statist. Sci. 27 (2012), no. 2, 278–293. doi:10.1214/11-STS371. https://projecteuclid.org/euclid.ss/1340110873


