Abstract
Let $\{Z_i: -\infty < i < +\infty\}$ be a strictly stationary $\alpha$-mixing sequence. Without specifying the dependence model giving rise to $\{Z_i\}$ and without specifying the marginal distribution of $Z_i$, we address the question of variance estimation for a general statistic $t_n = t_n(Z_1, \ldots, Z_n)$. For estimating $\mathrm{Var}\{t_n\}$ from just the available data $(Z_1, \ldots, Z_n)$ we propose computing subseries values: $t_m(Z_{i+1}, Z_{i+2}, \ldots, Z_{i+m})$, $0 \leq i < i+m \leq n$. These subseries values are used as replicates to model the sampling variability of $t_n$. In particular, we use adjacent nonoverlapping subseries of length $m = m_n$, with $m_n \rightarrow \infty$ and $m_n/n \rightarrow 0$. Our variance estimator is just the usual sample variance computed amongst these subseries values (after appropriate standardization). This estimator is shown to be consistent under mild integrability conditions. We present optimal (i.e., minimum m.s.e.) choices of $m_n$ for the special case where $t_n = \bar{Z}_n$ and $\{Z_i\}$ is a normal AR(1) sequence. A simulation study is conducted, showing that those same choices of $m_n$ are effective when $t_n$ is a robust estimator of location and $\{Z_i\}$ is subject to contamination.
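The subseries procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names `subseries_variance` and `simulate_ar1` are hypothetical, and so are the block length `m=40`, the AR(1) coefficient `phi=0.5`, and the unit innovation variance. The abstract does not spell out the "appropriate standardization", so the sketch assumes the common case in which $\mathrm{Var}\{t_n\}$ scales like $1/n$ and rescales the sample variance of the subseries values by $m/n$. The divisor $k$ (rather than $k-1$) in the sample variance is likewise an assumption.

```python
import random
import statistics


def subseries_variance(z, m, stat=statistics.fmean):
    """Estimate Var{stat(z)} from adjacent nonoverlapping blocks of length m."""
    n = len(z)
    k = n // m  # number of complete blocks; any leftover tail is ignored
    if m < 1 or k < 2:
        raise ValueError("need at least two complete blocks of length m")
    # Subseries values t_m(Z_{i+1}, ..., Z_{i+m}) for i = 0, m, 2m, ...
    values = [stat(z[j * m:(j + 1) * m]) for j in range(k)]
    mean_value = sum(values) / k
    # Sample variance among the subseries values (divisor k is an assumption).
    block_var = sum((v - mean_value) ** 2 for v in values) / k
    # Assumed standardization: Var{t_n} behaves like 1/n, so rescale by m/n.
    return block_var * m / n


def simulate_ar1(n, phi, seed=0):
    """Simulate a stationary normal AR(1) sequence with unit innovation variance."""
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution N(0, 1/(1 - phi^2)).
    z = [rng.gauss(0.0, 1.0) / (1 - phi ** 2) ** 0.5]
    for _ in range(n - 1):
        z.append(phi * z[-1] + rng.gauss(0.0, 1.0))
    return z


z = simulate_ar1(n=2000, phi=0.5, seed=1)
estimate = subseries_variance(z, m=40)  # estimated variance of the sample mean
robust = subseries_variance(z, m=40, stat=statistics.median)  # a robust location statistic
```

Passing a different `stat` (here the median) reflects the abstract's point that the statistic $t_n$ is general. The choice of `m` trades bias against variability of the estimator, which is the question the paper's optimal $m_n$ results address.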
Citation
Edward Carlstein. "The Use of Subseries Values for Estimating the Variance of a General Statistic from a Stationary Sequence." Ann. Statist. 14 (3) 1171 - 1179, September, 1986. https://doi.org/10.1214/aos/1176350057
Information