SIMEX and standard error estimation in semiparametric measurement error models

Abstract: SIMEX is a general-purpose technique for measurement error correction. There is a substantial literature on the application and theory of SIMEX for purely parametric problems, as well as for purely nonparametric regression problems, but there is neither application nor theory for semiparametric problems. Motivated by an example involving radiation dosimetry, we develop the basic theory for SIMEX in semiparametric problems using kernel-based estimation methods. This includes situations in which the mismeasured variable is modeled purely parametrically, purely nonparametrically, or has components that are modeled both parametrically and nonparametrically. Using our asymptotic expansions, easily computed standard error formulae are derived, as are the bias properties of the nonparametric estimator. The standard error method represents a new method for estimating variability of nonparametric estimators in semiparametric problems, and we show in both simulations and in our example that it improves dramatically on first order methods. We find that for estimating the parametric part of the model, standard bandwidth choices of order $O(n^{-1/5})$ are sufficient to ensure asymptotic normality, and undersmoothing is not required. SIMEX has the property that it fits misspecified models, namely ones that ignore the measurement error. Our work thus also more generally describes the behavior of kernel-based methods in misspecified semiparametric problems.


Introduction
Regression models with measurement errors arise frequently in practice and have attracted attention in the statistics literature. Semiparametric regression models with errors in variables have been considered by several authors in an attempt to develop measurement error calibration when the errors are in the linear part of a linear regression [17] or of a generalized linear regression [18]. The authors of [43] used a method of moments and deconvolution to construct the calibration for partially linear models in which the mismeasured covariate appears in both the parametric and nonparametric parts. SIMEX has been considered only for the partially linear model with measurement errors in the linear part [13]. However, we are after very general results, and do not restrict attention to simple partially linear models.
Indeed, the purpose of this paper is to derive the general theory for a popular alternative to regression-calibration, a simulation-extrapolation method (SIMEX) in the situations where the mismeasured variable is modeled purely parametrically, purely nonparametrically, or that the mismeasured variable has components that are modeled both parametrically and nonparametrically.
The SIMEX method [8,37] is a general-purpose, widely applicable method for correcting parameter estimates for the biases induced by measurement error in covariates. It is a functional method, in the sense that it makes no assumptions about the distribution of the unobserved true covariate.
A major strength of SIMEX is that it is extremely easy to implement: it requires only a program for computing estimates in the absence of measurement error, and the ability to simulate adding additional measurement error to the process. A brief description of the method is given in Section 2; many more details are available in [4]. There is a suite of programs for implementing SIMEX in Stata, as well as in R at http://cran.r-mirror.de/src/contrib/Descriptions/simex.html.
While SIMEX is well-studied in purely parametric and purely nonparametric problems, in semiparametric problems it has, to the best of our knowledge, been considered only for the partially linear model with measurement errors in the linear part [13]. In this paper we derive the general theory of SIMEX in semiparametric problems and describe a simple method to estimate standard errors for both the parametric and the nonparametric parts of the model. We also describe the first order bias properties of the nonparametric estimators. We show using our example that the standard error method improves dramatically on first order methods for estimating standard errors of the nonparametric components, and is thus of importance even when there is no measurement error.
An outline of this paper is as follows. In Section 2, we briefly define the SIMEX method. Section 3 briefly describes well-known standard estimation algorithms appropriate to semiparametric estimation. Sections 4 and 5 state the main results for the cases that the variable measured with error is modeled nonparametrically and parametrically, respectively. In the latter section we also add a brief discussion of what happens when the mismeasured variable is in both parts of the model. Importantly, both these sections describe simple methods for constructing standard error estimates of both parametric and nonparametric components, ones that in our example and simulations improve greatly upon first order methods. Section 6 presents a simulation study to evaluate the performance of SIMEX itself and, more importantly, our approach to variance estimation in semiparametric problems. In Section 7, we describe an important, complex problem in radiation epidemiology involving measurement error. Section 8 gives concluding remarks. All technical details are sketched in an Appendix.

SIMEX
The SIMEX method is both well-known and straightforward. Suppose we have a problem in which the predictor X cannot be observed, but we observe W = X + U, where U = Normal(0, $\Sigma_u$). Some components of U may equal zero, indicating no measurement error in that component. Consider a set of values $0 = \lambda_1 < \lambda_2 < \cdots < \lambda_J$.
• In the simulation step, additional independent measurement errors with covariance matrix $\lambda_j \Sigma_u$ are generated and added to the original W data, thereby creating data sets with successively larger measurement error variances. For the jth data set, the total measurement error variance is $(1 + \lambda_j)\Sigma_u$.
• Next, estimates are obtained from each of the generated contaminated data sets, using an algorithm that would have been used if there were no measurement error, i.e., one's favorite method.
• The simulation and estimation steps are repeated a large number B of times, and the average value of the estimate for each level of contamination $\lambda_j$ is calculated. These averages are plotted against the λ values and a regression technique, for example, polynomial least squares, is used to fit an extrapolant function to the averaged, error-contaminated estimates.
• Extrapolation to the ideal case of no measurement error (λ = −1) yields the SIMEX estimate.
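The three steps can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a scalar predictor, a straight-line regression fitted by ordinary least squares as the "favorite method", a known error variance `sigma_u2`, and a quadratic extrapolant. The function name `simex_slope` and the toy data are hypothetical.

```python
import numpy as np

def simex_slope(w, y, sigma_u2, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0),
                B=100, degree=2, seed=0):
    """SIMEX estimate of the slope in a straight-line regression of y on x,
    when only w = x + u is observed and var(u) = sigma_u2 is known."""
    rng = np.random.default_rng(seed)
    lambdas = np.asarray(lambdas, dtype=float)
    averages = np.empty(len(lambdas))
    for j, lam in enumerate(lambdas):
        slopes = np.empty(B)
        for b in range(B):
            # Simulation step: add extra error with variance lam * sigma_u2,
            # so the total error variance becomes (1 + lam) * sigma_u2.
            w_b = w + np.sqrt(lam * sigma_u2) * rng.standard_normal(w.shape)
            # Estimation step: the method that ignores measurement error
            # (polyfit with degree 1 returns [slope, intercept]).
            slopes[b] = np.polyfit(w_b, y, 1)[0]
        averages[j] = slopes.mean()
    # Extrapolation step: polynomial least squares in lambda, evaluated at -1.
    coef = np.polyfit(lambdas, averages, degree)
    return np.polyval(coef, -1.0), averages


# Toy data (assumed for illustration): true slope 1, error variance 0.16.
rng = np.random.default_rng(1)
n, sigma_u2 = 5000, 0.16
x = rng.standard_normal(n)
y = 1.0 * x + 0.5 * rng.standard_normal(n)
w = x + np.sqrt(sigma_u2) * rng.standard_normal(n)
simex, averages = simex_slope(w, y, sigma_u2)
naive = averages[0]          # lambda = 0 is the fit that ignores the error
print(f"naive slope {naive:.3f}, SIMEX slope {simex:.3f}")
```

In this toy setting the naive slope is attenuated toward zero, and the quadratic extrapolant removes most, though not all, of that attenuation.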

Remark 1
In general parametric problems, the "favorite method" will be equivalent to solving an estimating function. Because of the measurement error in X, the estimating function is biased because the underlying model is misspecified. This causes no difficulty, since it is well-known what happens to estimating function methods for misspecified models. Similarly, in nonparametric regression, the "favorite method" will be equivalent to solving a local estimating equation, and the same general remarks apply.

Remark 2
Semiparametric problems, in contrast, combine global and local estimating equations, and their properties under misspecified models are not well understood. See, for example, Ai and Chen (2007) for a sieve-based approach to this general issue. Our work can be looked at as addressing the problem of misspecified semiparametric models within the kernel framework, which is then applied to the SIMEX algorithm itself.

Likelihood functions and estimators
This paper considers a wide class of semiparametric problems with some covariates modeled parametrically and one covariate entering the model through a nonparametric function. Important examples of such models are the partially linear model and the partially linear logistic model considered in Section 6. For example, with X measured with error and Z measured exactly, and H(·) being the logistic distribution function, the partially linear logistic model with response Y can take either of two forms:
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{X^{\rm T}\beta + \theta(Z)\}; \qquad \mathrm{pr}(Y = 1 \mid X, Z) = H\{Z^{\rm T}\beta + \theta(X)\}.$$
In the first model, the variable measured with error is modeled parametrically, while in the second model it is modeled nonparametrically. We will derive results for both cases, and of course for much more general models.
SIMEX is based on repeatedly adding more measurement error and then calculating a standard estimator on this remeasured data set. In this section, we describe the well-known standard basic profile algorithm when estimation in the no-measurement error context is based upon maximization of a criterion function, e.g., a loglikelihood function. We will phrase things in terms of loglikelihoods, but as in [20], the case of any criterion function is similar.
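Let K(·) be a smooth symmetric density function with bounded support, let h be a bandwidth, and let $K_h(v) = h^{-1}K(v/h)$. Here we follow a traditional notation from the measurement error literature, where Y denotes the response, X denotes the covariate(s) measured with error and Z denotes the other covariate(s). Likelihood problems take two forms, depending on whether X is modeled parametrically or nonparametrically. In the former case, the loglikelihood is $L\{Y, X, \theta(Z), \beta\}$, while in the latter the loglikelihood is $L\{Y, Z, \theta(X), \beta\}$.
To handle both of these cases, we define generic random variables (A, D), and consider the loglikelihood as $L\{Y, A, \theta(D), \beta\}$, where D is scalar. For our notation about derivatives, make the following definitions:
$$L_\beta\{Y, A, \theta(D), \beta\} = \frac{\partial}{\partial \beta} L\{Y, A, \theta(D), \beta\}, \qquad L_\theta\{Y, A, \theta(D), \beta\} = \frac{\partial}{\partial v} L(Y, A, v, \beta)\Big|_{v = \theta(D)},$$
etc. Then, for any $\beta_*$, we estimate θ(·) by $\hat\theta(z_0, \beta_*)$ via standard local likelihood estimation [5]. We consider a local linear estimator, which is $(1, 0) \times \hat\alpha$, where $\hat\alpha = (\hat\alpha_0, \hat\alpha_1)^{\rm T}$ solves the local loglikelihood score equation
$$0 = n^{-1}\sum_{i=1}^n K_h(D_i - z_0)\,\{1, (D_i - z_0)/h\}^{\rm T}\, L_\theta\{Y_i, A_i, \alpha_0 + \alpha_1 (D_i - z_0)/h, \beta_*\}.$$
In general, $\hat\theta(z_0, \beta_*)$ converges to $\theta(z_0, \beta_*)$, where
$$0 = E[L_\theta\{Y, A, \theta(z_0, \beta_*), \beta_*\} \mid D = z_0]. \qquad (2)$$
By differentiation of (2), the derivative of $\theta(z_0, \beta_*)$ with respect to $\beta_*$, call it $\theta_\beta(z_0, \beta_*)$, satisfies
$$0 = E[L_{\theta\beta}\{Y, A, \theta(z_0, \beta_*), \beta_*\} + L_{\theta\theta}\{Y, A, \theta(z_0, \beta_*), \beta_*\}\,\theta_\beta(z_0, \beta_*) \mid D = z_0].$$
The profile method maximizes $\sum_{i=1}^n L\{Y_i, A_i, \hat\theta(D_i, \beta), \beta\}$ in β. Let $\hat\theta_\beta(z_0, \beta)$ be the derivative of $\hat\theta(z_0, \beta)$ with respect to β. Then profiling is equivalent to solving the equation
$$0 = n^{-1}\sum_{i=1}^n \left[L_\beta\{Y_i, A_i, \hat\theta(D_i, \beta), \beta\} + L_\theta\{Y_i, A_i, \hat\theta(D_i, \beta), \beta\}\,\hat\theta_\beta(D_i, \beta)\right].$$
The solution, $\hat\beta$, converges to $\beta_{PF}$, which satisfies
$$0 = E\left[L_\beta\{Y, A, \theta(D, \beta_{PF}), \beta_{PF}\} + L_\theta\{Y, A, \theta(D, \beta_{PF}), \beta_{PF}\}\,\theta_\beta(D, \beta_{PF})\right],$$
and because of (2), this means that $\beta_{PF}$ satisfies
$$0 = E[L_\beta\{Y, A, \theta(D, \beta_{PF}), \beta_{PF}\}].$$

Remark 3
For estimating $\beta_{PF}$, it is known that the profile estimate of $\beta_{PF}$ is asymptotically normally distributed when the bandwidth $h \propto n^{-a}$ for $1/5 \le a \le 1/3$, and that the optimal rate for estimating θ(·) is $h \propto n^{-1/5}$. See [20] for a definition and discussion of an alternative backfitting approach.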
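As a concrete special case of the profile algorithm, consider the Gaussian partially linear model, where the loglikelihood is proportional to $-\{Y - A^{\rm T}\beta - \theta(D)\}^2/2$. There the local linear likelihood estimate is a linear smoother applied to the partial residuals $Y - A^{\rm T}\beta$, and profiling reduces to least squares on smoothed-out residuals. The sketch below assumes this Gaussian case, an Epanechnikov kernel and a user-supplied bandwidth; the function names are hypothetical and the code is not the paper's software.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, a symmetric density supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def local_linear_smoother(d, h):
    """n x n matrix S such that (S @ r)[i] is the local linear fit of r on d,
    evaluated at d[i], with bandwidth h."""
    n = len(d)
    S = np.zeros((n, n))
    for i in range(n):
        u = (d - d[i]) / h
        k = epanechnikov(u)
        G = np.column_stack([np.ones(n), u])      # local design {1, (D_i - z0)/h}
        GtK = G.T * k                             # kernel-weighted design, 2 x n
        # First row of (G'KG)^{-1} G'K is the (1, 0) x alpha-hat operator.
        S[i] = np.linalg.solve(GtK @ G, GtK)[0]
    return S

def profile_fit(y, a, d, h):
    """Profile estimate of (beta, theta) in y = a @ beta + theta(d) + error."""
    S = local_linear_smoother(d, h)
    a_res = a - S @ a                             # covariates with d smoothed out
    y_res = y - S @ y
    beta, *_ = np.linalg.lstsq(a_res, y_res, rcond=None)
    theta = S @ (y - a @ beta)                    # theta-hat(D_i, beta-hat)
    return beta, theta
```

For a non-Gaussian loglikelihood the same two-level structure applies, but the inner local fit and the outer maximization over β both become iterative.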
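Define the remeasured data $W_{ib}(\lambda) = W_i + \lambda^{1/2} U_{ib}$ for $b = 1, \ldots, B$, where the simulated random variables are $U_{ib}$ = Normal(0, $\Sigma_u$). We then apply profile likelihood for each b and each λ, replacing $X_i$ by $W_{ib}(\lambda)$.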
If we fit a quadratic model in λ to the averaged estimates, the extrapolation to λ = −1 is a linear combination of those averages with weights $s_1, \ldots, s_J$.

Result 1 For fixed B and as $n \to \infty$, we have the asymptotic expansion (8) for the parametric part. The nonparametric estimator differs from $\theta_{simex}(x_0)$ with a large sample bias whose leading factor is $\phi_2 h^2/2$. Finally, since B can be made as large as we want, terms of order $O\{(Bnh)^{-1} + n^{-1}\}$ are much smaller than terms of order $O\{(nh)^{-1}\}$. Ignoring these terms, the limiting variance is given by (9).

Estimation for the nonparametric part
At first glance, estimating the variance in (9) is quite easy, because its form depends only on the case λ = 0 through $s_1^2$, and standard techniques apply. This result is a generalization of a result found by [3] for the Gaussian nonparametric case. Asymptotically, the variance of the SIMEX estimate is just a factor $s_1^2$ larger than the variance of nonparametric regression that ignores measurement error. As described above, with quadratic extrapolation and the λ-values (0.0, 0.5, 1.0, 1.5, 2.0), $s_1 = 3$, so the asymptotic variance of the SIMEX estimate is 9 times that of the estimate that ignores measurement error.
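The value $s_1 = 3$ can be checked directly. The sketch below assumes that the weights are the least-squares extrapolation weights, $s = D(D^{\rm T}D)^{-1}c$, where D has rows $(1, \lambda_j, \lambda_j^2)$ and $c = (1, -1, 1)^{\rm T}$ evaluates the fitted quadratic at λ = −1; the function name is hypothetical.

```python
import numpy as np

def extrapolation_weights(lambdas, degree=2, target=-1.0):
    """Weights s such that the polynomial least-squares extrapolant at `target`
    equals sum_j s[j] * (averaged estimate at lambdas[j])."""
    lambdas = np.asarray(lambdas, dtype=float)
    D = np.vander(lambdas, degree + 1, increasing=True)   # rows (1, lam, lam^2, ...)
    c = target ** np.arange(degree + 1)                   # (1, target, target^2, ...)
    return D @ np.linalg.solve(D.T @ D, c)

s = extrapolation_weights([0.0, 0.5, 1.0, 1.5, 2.0])
print(np.round(s, 3))        # approximately (3.0, -0.4, -1.8, -1.2, 1.4)
print(round(s[0] ** 2, 3))   # the first-order variance inflation factor s_1^2
```

The weights sum to one, and only the first one enters the first-order variance, which is why the formal asymptotic variance is inflated by the factor $s_1^2 = 9$.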
However, as we demonstrate numerically in Sections 6 and 7, this asymptotic result, while interesting, is not useful in finite samples.
In the case of classical nonparametric regression, [38] recognized that the SIMEX estimator is linear in the $Y_i$, so that an explicit expression of its variance is available if $\mathrm{var}\{Y \mid W_b(\lambda)\}$ is known; they then estimate $\mathrm{var}\{Y \mid W_b(\lambda)\}$. In our case, not only do we have the issue of the parametric component, but also there is no closed-form formula for the variance of the SIMEX estimator.
In order to get roughly believable standard errors, we turn to the theory, particularly to equations (8) and (A.6) of the appendix, the former being an asymptotic expansion for the parametric part, the latter for the nonparametric part. The estimate of the nonparametric function is just
$$\hat\theta_{simex}(x_0) = \sum_{j=1}^J s_j B^{-1} \sum_{b=1}^B \hat\theta_b\{x_0, \hat\beta_b(\lambda_j), \lambda_j\},$$
and to create reasonable standard error estimates we need to account for the variability in $\hat\beta_b(\lambda_j)$ via (8), the variability in $\hat\theta_b\{x_0, \beta(\lambda_j), \lambda_j\}$ via (A.6), and the fact that the number of SIMEX steps B is finite. Define $\hat\beta(\lambda_j) = B^{-1}\sum_{b=1}^B \hat\beta_b(\lambda_j)$. Using a Taylor series involving $\hat\beta_b(\lambda_j)$, these formulae imply that for fixed B, a consistent estimate of the variance of the estimator $\hat\theta_{simex}(x_0)$ (see appendix A.6.1 for details) is $n^{-1}$ times the sample variance of the terms in (10), each of which is a sum over $j = 1, \ldots, J$; there, $\hat\bullet_{ib}$ is the estimated version of $\bullet_{ib}$ and Σ is estimated by its sample counterpart $\hat\Sigma$. Asymptotically, if B = ∞ and there are only a finite number of λ-values, only the first term with j = 1 contributes to the first order variance, but we found in both the simulations of Section 6 and the empirical example in Section 7 that ignoring the other terms causes an underestimation of standard error by as much as 50%. In our numerical work we estimate the variance as an average of squares of terms from (10).

Estimation for the parametric part
Define $T_{jk} = \mathrm{cov}\{\xi_{i,B}(\lambda_j), \xi_{i,B}(\lambda_k)\}$. The asymptotic covariance matrix for $\hat\beta_{simex}$ is $n^{-1}\sum_{j,k=1}^J s_j s_k \Sigma^{-1}(\lambda_j) T_{jk} \Sigma^{-1}(\lambda_k)$, so that we need to estimate Σ(λ) and $T_{jk}$. Estimation of these two terms is relatively straightforward via plug-in methods. For example, constituent terms such as $E\{L_{\beta\beta}(\bullet_{ib}) \mid W_{ib}(\lambda)\}$ can be estimated by pooling across $b = 1, \ldots, B$ and using nonparametric regression of $L_{\beta\beta}[Y_i, Z_i, \theta\{W_{ib}(\lambda), \beta(\lambda), \lambda\}, \beta(\lambda)]$ on the $W_{ib}(\lambda)$. In practice, to save storage requirements, one can use only a few of the SIMEX samples, do the nonparametric regression on a wide grid, and then interpolate for the other SIMEX samples. Having estimated the terms $\xi_{i,B}(\lambda_j)$, $T_{jk}$ can be estimated via the sample covariance of $\hat\xi_{i,B}(\lambda_j)$ and $\hat\xi_{i,B}(\lambda_k)$.
In practice, there is often little need to adjust the standard errors for the estimation of the nonparametric component. The reason is that the semiparametric profile algorithm involves a projection similar to that which occurs in parametric problems.

Introduction and theoretical development
Here we consider the case that X is modeled parametrically and scalar Z is modeled nonparametrically. The loglikelihood or criterion function then takes the form $L\{Y, X, \theta(Z), \beta\}$.
For any λ, define β(λ) and $\theta\{z_0, \beta(\lambda), \lambda\}$ as the values to which the profile likelihood estimates converge for given λ. By definition, $\xi_{i,B}(\lambda)$ has mean zero, while $\chi_{i,B}(\lambda)$ has mean zero conditional on $Z_i$.
Result 2 For fixed B and as $n \to \infty$, $\hat\beta_{simex}$ has an asymptotic expansion, and the nonparametric estimator differs from $\theta_{simex}(z_0)$ with a large sample bias whose leading factor is $\phi_2 h^2/2$ and with the variance given in (12).

Standard error estimation
Result 2 and the expansions given in the Appendix make it easy to construct asymptotically correct variance and standard error estimates.

Estimation for the nonparametric part
Calculation of an asymptotic variance estimate for $\hat\theta_{simex}(z_0)$ follows a strategy similar to that in Section 4.2.1. In the form of the variance given in (12), the density function $f_Z(\cdot)$ can be estimated by any kernel density estimate. To estimate $\Omega_Z\{z_0, \beta(\lambda), \lambda\}$, first form the averages over $b = 1, \ldots, B$ of the terms $L_{\theta\theta}[Y_i, W_{ib}(\lambda), \theta\{Z_i, \beta(\lambda), \lambda\}, \beta(\lambda)]$ and then regress these averages on Z via nonparametric regression. Let $\hat\chi_{i,B}(\lambda)$ be the obvious estimate of $\chi_{i,B}(\lambda)$.
As in the technical arguments in Section A.6, the analogue of Section 4.2.1 is to estimate the variance of θ simex (z 0 ), while taking into account the estimation of the parametric part, as the sample variance of the following terms:

Estimation for the parametric part
First consider $\hat\beta_{simex}$ and define $T_{jk} = \mathrm{cov}\{\xi_{i,B}(\lambda_j), \xi_{i,B}(\lambda_k)\}$. Then the asymptotic covariance matrix for $\hat\beta_{simex}$ is just $n^{-1}\sum_{j,k=1}^J s_j s_k \Sigma^{-1}(\lambda_j) T_{jk} \Sigma^{-1}(\lambda_k)$, so that we need to estimate Σ(λ) and $T_{jk}$. The former can be done by simple plug-in methods which we describe next. The terms $T_{jk}$ are also readily estimated via plug-in rules. Let $\hat\xi_{i,B}(\lambda_j)$ be the obvious estimate of $\xi_{i,B}(\lambda_j)$. Then $T_{jk}$ can be estimated as the sample covariance matrix between $\hat\xi_{i,B}(\lambda_j)$ and $\hat\xi_{i,B}(\lambda_k)$.
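Once the plug-in pieces are available, the covariance formula is a double sum that is easy to code. The following sketch assumes the estimated terms are stored as arrays: `xi_hat[j]` holds the n estimated $\xi_{i,B}(\lambda_j)$ as rows, `sigma_hat[j]` is the estimate of $\Sigma(\lambda_j)$ (taken to be symmetric), and `s` holds the extrapolation weights. The array layout and the function name are hypothetical.

```python
import numpy as np

def simex_beta_covariance(xi_hat, sigma_hat, s):
    """Plug-in estimate of n^{-1} sum_{j,k} s_j s_k Sigma_j^{-1} T_jk Sigma_k^{-1}.

    xi_hat    : array (J, n, p), estimated xi_{i,B}(lambda_j) for each j and i
    sigma_hat : array (J, p, p), estimated Sigma(lambda_j)
    s         : array (J,), extrapolation weights
    """
    J, n, p = xi_hat.shape
    sigma_inv = np.linalg.inv(sigma_hat)               # batched inverse, (J, p, p)
    centered = xi_hat - xi_hat.mean(axis=1, keepdims=True)
    cov = np.zeros((p, p))
    for j in range(J):
        for k in range(J):
            # Sample cross-covariance between the lambda_j and lambda_k terms.
            T_jk = centered[j].T @ centered[k] / (n - 1)
            cov += s[j] * s[k] * sigma_inv[j] @ T_jk @ sigma_inv[k]
    return cov / n
```

Standard errors for the components of the SIMEX estimate of β are then the square roots of the diagonal of the returned matrix.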

When both X and Z are mismeasured
Our results also apply to the case that both X and Z are subject to measurement error. Consider the problem in Section 4, in which Z is modeled parametrically and X is modeled nonparametrically. Suppose that instead of observing X we observe W = X + U, and that instead of Z we observe V, equal to Z plus additive measurement error. Then all the results in Section 4, including standard error estimation, go through if Z is replaced everywhere by the remeasured version $V_b(\lambda)$. Thus, for example, in (5) each $Z_i$ is replaced by $V_{ib}(\lambda)$, and similar substitutions are made everywhere else.

The partially linear model
We generated data following the partially linear model
$$Y = Z^{\rm T}\beta_0 + \theta_0(X) + \epsilon.$$
We take the true values as $\beta_0 = (1, 1)^{\rm T}$ and $\theta_0(x) = 0.5\cos(2x)$. In addition, ε = Normal(0, 1) and X = Uniform(0, π). We generated $Z = (Z_1, Z_2)^{\rm T}$ from a Normal distribution with identity covariance matrix and conditional mean E(Z|X). We assumed that the nonparametrically modeled variable X is measured with error, that is, we observe W = X + U with U = Normal(0, $\sigma^2_u$) and $\sigma^2_u = 0.16$. The sample size was n = 200. Nonparametric regression was performed using the Epanechnikov kernel with bandwidth $h = \sigma_w n^{-1/5}$, where $\sigma_w$ is the standard deviation of the observed predictors W. We generated 1,000 data sets, and for each of the generated data sets we used B = 200 SIMEX simulated data sets and $(\lambda_1, \ldots, \lambda_5) = (0.0, 0.5, \ldots, 2.0)$. We considered both quadratic and cubic extrapolants with similar results: only the quadratic extrapolant results are given here, as the same general conclusions applied in the cubic case.
Of course, in this context the SIMEX estimates are linear in the responses, and it is possible to create estimated standard errors for the nonparametric component estimation using this fact.However, here the purpose is to test the general method described in Section 4.2.1, and not to see whether ad hoc methods work well.
The estimated pointwise standard errors of SIMEX fits for θ(x) are given in Figure 1 for quadratic extrapolation. The solid line is our new method of standard error estimation, the dot-dashed line is the usual pointwise standard error estimate using our method to account for finite B but ignoring the asymptotically negligible variability of the parametric part (that is, the sample variance of the first term of (10)), and the dashed line represents the pointwise standard error across 1,000 simulations. The conclusion is clear: ignoring the variability of the parametric part leads to a badly biased estimated standard error for the SIMEX procedure. In contrast, our new method of estimating standard error gives results more in line with reality.
We have not displayed the "large-B, known β" asymptotic formula (9). However, that method gave standard error estimates whose mean was grossly too large, more than double the actual standard errors. See Figure 5 for an example of this over-estimation in the empirical context of Section 7.

Figure 1. Results from the simulation study in Section 6.1 for the partially linear model with measurement error in the covariate modeled nonparametrically. Displayed are the pointwise estimated standard errors of the SIMEX estimate of θ(x) using quadratic extrapolation. Solid line: our method as described in Section 4.2.1. Dot-dashed line: standard errors when the variability of the parametric part is not taken into account. Dashed line: pointwise standard errors of the estimate of θ(x) over 1,000 simulations.

In results not reported here, we also considered standard error estimation for the parametric part. Our standard error estimates were in very close agreement with the actual standard deviations from the simulation, and confidence intervals for the SIMEX limiting value of β had coverage very close to nominal.

Partially linear logistic model
We generated data following a partially linear logistic model, in which H(·) denotes the logistic distribution function. We took the true values $\beta_1 = \beta_2 = 0.5$ and $\theta_0(x) = 0.5\cos(2x) - 1$. We generated X from Uniform(0, π) and Z from a Normal distribution with variance 1.0 and conditional mean E(Z|X) = −X. We assume that the nonparametrically modeled variable X is measured with error, that is, we observe W = X + U with U = Normal(0, $\sigma^2_u$) and $\sigma^2_u = 0.16$. We use sample size n = 500. Nonparametric regression was performed using the Epanechnikov kernel with bandwidth $h = \sigma_w n^{-1/5}$, where $\sigma_w$ is the standard deviation of the observed predictors W. We generated 2,000 data sets, and for each of the generated data sets we used B = 200 SIMEX simulated data sets and took $(\lambda_1, \ldots, \lambda_5) = (0.0, 0.5, \ldots, 2.0)$. We considered both quadratic and cubic extrapolants: only the quadratic results are reported here, as the same general conclusions applied in the cubic case.
The estimated pointwise standard errors of SIMEX fits for θ(x) are given in Figure 2 for quadratic extrapolation. The solid line is our new method of standard error estimation, the dot-dashed line is the usual pointwise standard error estimate ignoring the variability of the parametric part (that is, the sample variance of the first term of (10)), and the dashed line represents the pointwise standard error across 1,000 simulations. The conclusion is clear: ignoring the variability of the parametric part leads to a noticeably biased estimated standard error for the SIMEX procedure, although the bias is not as serious as in the linear case. In contrast, our new method of estimating standard error gives results more in line with reality.
In results not reported here, we also considered standard error estimation for the parametric part. Our standard error estimates were again in very close agreement with the actual standard deviations from the simulation, and confidence intervals for the SIMEX limiting value of β had coverage very close to nominal.

Why SIMEX?
This section gives some brief numerical evidence of the power of SIMEX in semiparametric problems. We performed a simulation of a semiparametric model in which the variable measured with error, X, was modeled parametrically. In the parametric simulation, we had Z = Uniform[0, π], X = Normal(−1.0, 1.0), n = 500, $\beta_1 = \beta_2 = 0.7$, $\sigma^2_u = 0.16$ and var(Y|X, Z) = 0.5, with X entering the mean of Y parametrically through $(\beta_1, \beta_2)$ and Z entering through θ(·). We used $(\lambda_1, \ldots, \lambda_5) = (0.0, 0.5, 1.0, 1.5, 2.0)$, with B = 100 SIMEX simulated data sets for each λ. Both the quadratic and the cubic extrapolant were considered. For purposes of mean squared error analysis, we evaluated the estimated function $\hat\theta(\cdot)$ on an equally spaced grid from 0.10 to π − 0.10. Nonparametric regression was performed using the Epanechnikov kernel function. There were 200 simulated data sets.
For bandwidth selection, we used two algorithms. First, at each calculation of the nonparametric function for fixed $(\beta_1, \beta_2)$, we used the DPI bandwidth selection method of Ruppert, Sheather and Wand (1995). We also did a separate analysis with the bandwidth fixed globally at $\sigma_w n^{-1/5}$, where $\sigma_w$ is the standard deviation of the observed error-prone predictors W. The results were in rough agreement, with the former reported here.
Because in this simulation X and the measurement error U are normally distributed, and X is independent of Z, it follows that the distribution of X given (Z, W) is also normal, i.e., Normal($\alpha_0 + \alpha_1 W$, $\sigma^2_{x|w}$), where, with a = 1/1.16, $\alpha_0 = -(1 - a)$, $\alpha_1 = a$ and $\sigma^2_{x|w} = 0.16a$; see [4] (Chapter 3). The mean of Y given the observed data (W, Z) then follows explicitly. Thus, the naive estimate that ignores measurement error should be biased downward by the amount 0.22 for $\beta_1$ and by 0.18 for $\beta_2$. The bias in the estimated function is the constant $\beta_2 \sigma^2_{x|w} = 0.10$.
Table 1 basically confirms what the theory says. The simulated mean bias of the naive estimator for both $\beta_1$ and $\beta_2$ is very nearly the large sample value, and both the quadratic and cubic extrapolants largely remove this bias. With n = 500, this translates into a sizeable gain in mean squared error efficiency. For estimating θ(·), there is little bias even theoretically in the naive estimator, relative to the variability of nonparametric regression kernel estimators, and we see little difference between the naive and SIMEX estimators.
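The calibration constants quoted above follow from the standard formulas for the conditional distribution of X given W when both X and U are normal. The short sketch below recomputes them for the simulation settings (mean −1 and variance 1 for X, error variance 0.16, $\beta_2 = 0.7$); the function name is hypothetical, and only the constants and the bias of the estimated function are checked, not the biases of $\beta_1$ and $\beta_2$.

```python
import numpy as np

def normal_calibration(mu_x, sigma_x2, sigma_u2):
    """Return (alpha0, alpha1, var(X | W)) for X ~ Normal(mu_x, sigma_x2)
    and W = X + U with U ~ Normal(0, sigma_u2) independent of X."""
    a = sigma_x2 / (sigma_x2 + sigma_u2)      # reliability ratio, here 1/1.16
    return mu_x * (1.0 - a), a, sigma_u2 * a

alpha0, alpha1, sigma2_x_given_w = normal_calibration(-1.0, 1.0, 0.16)
bias_theta = 0.7 * sigma2_x_given_w           # beta_2 * var(X | W)
print(round(alpha0, 3), round(alpha1, 3), round(sigma2_x_given_w, 3))
print(round(bias_theta, 2))                   # the constant bias in the function
```

With these inputs the reliability ratio is about 0.86, the conditional variance about 0.14, and the constant bias in the estimated function rounds to 0.10, in agreement with the value reported in the text.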

Introduction
An important example of the types of semiparametric problems that SIMEX is able to address is the following. In the 1950s, the United States conducted above-ground nuclear testing, and in the 1980s the University of Utah conducted the Nevada Test Site (NTS) Thyroid Disease Study. In the Nevada study, 2,491 individuals who were exposed to radiation as children were examined for thyroid disease. The primary radiation exposure to the thyroid glands of these children came from the ingestion of milk and vegetables contaminated with radioactive isotopes of iodine. The idea of the study was to relate various thyroid disease outcomes to radiation exposure to the thyroid. The original version of this study was described by [15,39] and [34]. Recently, the dosimetry for the study was redone [35], and the study results were reported [22].
The estimation of individual radiation dose that occurred 50 years in the past is well-known to be subject to large uncertainties, especially when mathematical models are employed due to the absence of direct measurements of the concentrations of radioactivity in foods or within the thyroid gland of individuals. There are many references on this subject, a good introduction to which is given by [29] and the many statistical papers in that volume. Various statistical papers [21,24,26,28,32,33,40] describe measurement error properties and analysis in this context. What is typical in these studies is to build a large dosimetry model that attempts to convert the known data, e.g., about the above-ground nuclear tests, to radiation actually absorbed into the thyroid. Dosimetry calculations for individual subjects were based on age at exposure, gender, residence history, whether as a child the individual was breast-fed, and a diet questionnaire filled out by the parent focusing on milk consumption and vegetables. The data were then input into a complex model and, for each individual, the point estimate of thyroid dose (the arithmetic mean of a lognormal distribution of dose estimates) and an associated measurement error term (the geometric standard deviation) were reported.
Generally, authors engaged in dose reconstruction using mathematical models conclude that radiation doses are estimated with a combination of Berkson measurement error and the classical type of measurement error. This type of model, see [28], in the log-scale of dose says that true log-dose T is related to observed or calculated log-dose W by a latent intermediate X via
$$T = X + U_{berk}, \qquad W = X + U_{class},$$
where $U_{berk}$ is the Berkson uncertainty with variance $\sigma^2_{u,berk}$ depending on the individual, and $U_{class}$ is the classical uncertainty with variance $\sigma^2_{u,class}$ depending on the individual. In the NTS data, the total uncertainty $\sigma^2_{u,class} + \sigma^2_{u,berk}$ is known, but not the relative contributions. Several of the references given above do sensitivity analyses with different relative contributions: in our illustration, we will consider two situations, one with 50% of the total uncertainty being Berkson and the other with all the uncertainty being classical.
It is typical to assume that the Berkson error $U_{berk}$ is normally distributed. If X could be observed, it is typical to use the total mean dose $\exp(X + \sigma^2_{u,berk}/2)$ as the main predictor of risk, and we take this as the target.
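The form of the total mean dose is the mean of a lognormal variable: if the true log-dose is X plus a normal Berkson error with mean zero and variance $\sigma^2_{u,berk}$, then the expected dose given X is $\exp(X + \sigma^2_{u,berk}/2)$. A small Monte Carlo check, with illustrative values for X and the Berkson variance that are assumptions and not taken from the NTS data:

```python
import numpy as np

def total_mean_dose(x, sigma2_berk):
    """Mean of exp(x + U) when U ~ Normal(0, sigma2_berk)."""
    return np.exp(x + sigma2_berk / 2.0)

rng = np.random.default_rng(0)
x, sigma2_berk = 0.3, 0.5                      # illustrative values only
u_berk = np.sqrt(sigma2_berk) * rng.standard_normal(1_000_000)
monte_carlo = np.exp(x + u_berk).mean()        # simulated mean dose given x
closed_form = total_mean_dose(x, sigma2_berk)
print(f"Monte Carlo {monte_carlo:.3f}, closed form {closed_form:.3f}")
```

Because the Berkson variance differs across individuals, the term $\sigma^2_{u,berk}/2$ is an individual-specific shift on the log scale.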
Let Y, the response, be the incidence of thyroiditis (inflammation of the thyroid gland), and consider in addition a variable measured without error, Z, the sex of the patient. A typical model relating total mean dose and gender to disease is the excess relative risk model (15), in which H(·) is the logistic distribution function and γ is called the excess relative risk. Because the amount of Berkson uncertainty depends upon the individual, the term $\sigma^2_{u,berk}/2$ in the excess relative risk model may be thought of as an offset.
There is some doubt as to what the correct dose-response model should be, and indeed some researchers use linear dose-response models. It thus makes sense to consider a semiparametric model that allows the dose-response to be flexible. The semiparametric model of interest here is the partially linear logistic model (16), in which θ(·) is an unknown function. This is an example in which the variable that is known exactly, Z, is modeled parametrically but the variable measured with error, X, is modeled nonparametrically.

Data analysis
In our analysis, we fit models (15) and (16). We used quadratic extrapolation, set the λ-values as (0.0, 0.5, 1.0, 1.5, 2.0), used B = 100 SIMEX simulated data sets, and employed the Epanechnikov kernel function. We assumed 50% of the measurement error was Berkson and the rest was of the usual additive form.
We also redid the analysis with all the measurement error being of the classical form. The bandwidth chosen was 1.5, but similar results were obtained for 1.0 and 2.0. Profiling was used in the semiparametric model calculations: it was simple to implement because the nonparametric regressions used standard software for logistic regression with weights and an offset. That is, in model (16), for any given β, θ(·) can be estimated by logistic regression with the offset Zβ and with the weights given by the kernel weights.
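The inner step of the profile algorithm, a kernel-weighted logistic regression with offset Zβ, can be sketched as follows. This is a minimal illustration using a hand-written Newton iteration rather than the standard software mentioned above; it assumes a scalar Z, a local linear fit and the Epanechnikov kernel, and the function names are hypothetical.

```python
import numpy as np

def expit(t):
    """Logistic distribution function H."""
    return 1.0 / (1.0 + np.exp(-t))

def weighted_logistic(G, y, offset, weights, n_iter=25, tol=1e-8):
    """Newton iterations for logistic regression with design G,
    a fixed offset and nonnegative observation weights."""
    alpha = np.zeros(G.shape[1])
    for _ in range(n_iter):
        p = expit(offset + G @ alpha)
        score = G.T @ (weights * (y - p))
        info = (G * (weights * p * (1.0 - p))[:, None]).T @ G
        step = np.linalg.solve(info, score)
        alpha += step
        if np.max(np.abs(step)) < tol:
            break
    return alpha

def local_logistic_theta(x0, x, y, z, beta, h):
    """Local linear estimate of theta(x0) for a given beta, in the model
    pr(Y = 1 | X, Z) = H{Z * beta + theta(X)}."""
    u = (x - x0) / h
    k = 0.75 * np.clip(1.0 - u**2, 0.0, None)      # Epanechnikov kernel weights
    keep = k > 0                                   # only points inside the window
    G = np.column_stack([np.ones(keep.sum()), u[keep]])
    alpha = weighted_logistic(G, y[keep], z[keep] * beta, k[keep])
    return alpha[0]                                # intercept = theta-hat(x0)
```

Repeating `local_logistic_theta` over a grid of `x0` values gives the estimated function for the current β; the outer step then updates β with the estimated function plugged in.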
Standard errors were estimated as described in Section 4.2.1. In addition, we ran 1000 bootstrap samples and recalculated the estimates for each.

Various fits
Because X and Z are essentially independent in the data, the estimate of $\beta_1$ is not much affected by the measurement error. Indeed, in either the mixture of Berkson and classical errors, or when there is no Berkson error, and using either the excess relative risk model (15), a naive semiparametric fit that ignores the measurement error, or the SIMEX semiparametric fit, the estimate of $\beta_1$ ≈ 1.75 with a standard error (s.e.) ≈ 0.25. There were only very slight differences in any of these cases. The result is plausible: $\beta_1$ indicates that women are at higher risk for thyroiditis.

Figure 3. Logits of the effects of dose in the Nevada Test Site thyroiditis data. Solid lines are the fits that ignore measurement error, dashed are those that account for measurement error. Left two panels: the mixture of Berkson errors model using parametric (top) and semiparametric (bottom) fits. Right two panels: the no Berkson error model using parametric (top) and semiparametric (bottom) fits. In the bottom two panels, the dashed lines are the SIMEX fits.

Figure 3 shows the results of various fitting methods. In all cases, the solid lines are the fits that ignore measurement error, while the dashed lines account for measurement error. The left two panels give results for the mixture of Berkson errors model using parametric (top) and semiparametric (bottom) fits. The right two panels give fits when no Berkson error is assumed, using parametric (top) and semiparametric (bottom) fits. In the bottom two panels, the dashed lines are the SIMEX fits. The overall conclusion is that there is some change in the logits when the classical error is accounted for, and if all the error is classical, a substantial change is seen. The excess relative risk parameter estimates for the naive and measurement error analyses are, respectively, 6.24 and 7.75 for the mixture error model, and 7.83 and 12.60 for the no Berkson error model.

Standard Errors
Figures 4 and 5 show the power of our asymptotic theory for estimating standard errors. All analyses concern the mixture of Berkson and classical errors.
In Figure 4, we consider the naive semiparametric fits, i.e., s 1 = 1 and s j = 0 for j > 1. The solid line shows the usual estimated pointwise standard errors that ignore the estimation of β 1: these estimates ignore the second term in (10). The dashed line is our new method based upon equation (10), while the dot-dashed line is the pointwise standard error from 1000 bootstrap simulations. The result is clear: ignoring the estimation of β 1 produces standard errors for the nonparametric fit that are badly biased.
In Figure 5, we consider the SIMEX semiparametric fits. The solid line shows the usual estimated pointwise standard errors based upon the formal theory in Result 1 and equation (9). The dashed line is our new method based upon equation (10), while the dot-dashed line is the pointwise standard error from 1000 bootstrap simulations. The dotted line shows the estimated pointwise standard errors when the variability of the parametric part is not taken into account. The result is also clear: the formal asymptotic theory is not very useful, and it is only our asymptotic expansions that give sensible results.
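The bootstrap benchmark used in these comparisons can be illustrated with a minimal sketch. It is not the semiparametric fit of this paper: as a stand-in it uses a Nadaraya-Watson kernel smoother on simulated data, and the function names (`nw_fit`, `bootstrap_pointwise_se`), the bandwidth, and the data-generating model are all assumptions made purely for illustration.

```python
import numpy as np

def nw_fit(x, y, grid, h):
    """Nadaraya-Watson estimate of E[y | x] on `grid` (Gaussian kernel, bandwidth h)."""
    k = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

def bootstrap_pointwise_se(x, y, grid, h, n_boot=1000, seed=0):
    """Pointwise bootstrap standard errors: resample (x, y) pairs, refit, take the SD."""
    rng = np.random.default_rng(seed)
    n = len(x)
    fits = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample observations with replacement
        fits[b] = nw_fit(x[idx], y[idx], grid, h)
    return fits.std(axis=0, ddof=1)           # one standard error per grid point

# Hypothetical simulated data (not the Nevada Test Site data).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=200)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(200)
grid = np.linspace(0.05, 0.95, 25)
se = bootstrap_pointwise_se(x, y, grid, h=0.1, n_boot=1000)
```

Each bootstrap replicate requires a complete refit, which is why the bootstrap becomes expensive when a single fit is itself a SIMEX procedure.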

Remark 4
For SIMEX, it is worth pointing out that bootstrapping is particularly time-consuming, so having workable standard errors via (10) is of great practical importance. With B = 100 and four non-zero λ values, our standard error formulae require 4B + 1 = 401 semiparametric fits. The bootstrap with 1000 bootstrap samples requires 401,000 semiparametric fits.
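The fit count in this remark can be traced through a small SIMEX sketch. The example below is an assumption-laden illustration rather than the semiparametric procedure of this paper: it applies SIMEX with quadratic extrapolation to the slope of a simple linear regression on simulated data, with a known measurement error standard deviation `sigma_u`. The names `ols_slope` and `simex_slope`, the λ grid, and the simulated model are hypothetical.

```python
import numpy as np

def ols_slope(w, y):
    """Least-squares slope of y on w; stands in for one model fit."""
    wc = w - w.mean()
    return float(wc @ (y - y.mean()) / (wc @ wc))

def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), B=100, seed=0):
    """Return (naive estimate, SIMEX estimate, number of model fits)."""
    rng = np.random.default_rng(seed)
    lam_grid = [0.0]
    est = [ols_slope(w, y)]                   # lambda = 0: the naive fit
    n_fits = 1
    for lam in lambdas:
        reps = []
        for _ in range(B):
            # Simulation step: add extra error with variance lam * sigma_u^2.
            w_b = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            reps.append(ols_slope(w_b, y))
            n_fits += 1
        lam_grid.append(lam)
        est.append(float(np.mean(reps)))      # average over the B remeasured data sets
    # Extrapolation step: quadratic in lambda, evaluated at lambda = -1.
    coef = np.polyfit(lam_grid, est, deg=2)
    return est[0], float(np.polyval(coef, -1.0)), n_fits

# Hypothetical simulated data with true slope 1 and classical error in W.
rng = np.random.default_rng(2)
x = rng.standard_normal(2000)
w = x + 0.5 * rng.standard_normal(2000)
y = 2.0 + 1.0 * x + 0.5 * rng.standard_normal(2000)
naive, simex, n_fits = simex_slope(w, y, sigma_u=0.5)
bootstrap_fits = 1000 * n_fits                # cost of bootstrapping the whole procedure
```

With B = 100 and four non-zero λ values the counter reaches 4 × 100 + 1 = 401 fits, and repeating the whole procedure on 1000 bootstrap samples multiplies this by 1000.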

Discussion
The main point of this paper is to derive the limiting distribution of SIMEX in semiparametric problems, when the variable X subject to measurement error is modeled parametrically, nonparametrically, or as a combination of both, and to give computable, asymptotically correct standard error estimates that improve upon first-order methods in empirical applications. In the case that the variable measured with error is modeled nonparametrically, our standard error formula is new, and was vastly superior to the asymptotic formula in both simulations and in our empirical analysis. Our method of estimating standard errors has not previously been discussed in the literature, as far as we know, and was facilitated by our asymptotic expansions.
It is useful to point out that the standard error estimates are also applicable to cases in which there is no measurement error. Indeed, in simulations that are the no measurement error versions of the problems in Sections 6.1 and 6.2, the same basic phenomenon happens: the standard errors for the nonparametric function obtained by ignoring the asymptotically negligible variability in the parametric parts considerably underestimated the actual simulated standard errors, while our methods were able to reproduce them rather well.
There are two issues that we have not addressed in detail.
• We have not discussed the issue of bandwidth selection in detail, with our numerical results using ad hoc bandwidth choices. However, based on our results, at least in principle this is straightforward. As in [38], but modified to the semiparametric case, the idea is to use our expansions and EBBS [30] to estimate the bias, while our new variance estimates of the nonparametric part given in Sections 4.2.1 and 5.2.1 are used to derive more precise estimated standard errors, from which the estimated mean squared errors can then be calculated and minimized.
• It is seemingly straightforward, although notationally tedious and algebraically complex, to extend the results of Sections 4-5 to the case that the covariance matrix of the measurement errors is unknown and needs to be estimated. In the case that the mismeasured X is modeled parametrically, this is a relatively simple combination of our work and that of [3]. The more difficult case occurs when the mismeasured X is modeled nonparametrically: the technical issue is that the arguments in the kernel weights will now depend on the estimated measurement error variance.
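The bandwidth-selection idea in the first item can be sketched as follows, assuming that bias and standard error estimates at a fixed evaluation point are already available for each candidate bandwidth. The function name `select_bandwidth` and the numbers in the example are made up for illustration; they are not output from EBBS or from the variance formulae of this paper.

```python
import numpy as np

def select_bandwidth(bandwidths, bias_hat, se_hat):
    """Pick the bandwidth minimizing estimated MSE = bias^2 + se^2."""
    bandwidths = np.asarray(bandwidths, dtype=float)
    mse_hat = np.asarray(bias_hat, dtype=float) ** 2 + np.asarray(se_hat, dtype=float) ** 2
    j = int(np.argmin(mse_hat))
    return float(bandwidths[j]), mse_hat

# Illustrative (made-up) inputs: bias grows and variance shrinks as h increases.
h_grid = [0.1, 0.2, 0.3, 0.4]
bias_hat = [0.01, 0.04, 0.09, 0.16]
se_hat = [0.20, 0.14, 0.11, 0.10]
h_best, mse_hat = select_bandwidth(h_grid, bias_hat, se_hat)
```

In practice the minimization could be carried out pointwise or on an average of the estimated mean squared errors over a grid of evaluation points; the sketch shows only the pointwise version.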

A.3.2. Semiparametric Models
We now turn to the estimation of β PF. Define
Fig 1. Results from the simulation study in Section 6.1 for the partially linear model with measurement error in the covariate modeled nonparametrically. Displayed are the pointwise estimated standard errors of the SIMEX estimate of θ(x) using quadratic extrapolation. Solid line: our method as described in Section 4.2.1. Dot-dashed line: standard errors when the variability of the parametric part is not taken into account. Dashed line: pointwise standard errors of θ(z) over 1,000 simulations.
Fig 3. Logits of the effects of dose in the Nevada Test Site thyroiditis data. Solid lines are the fits that ignore measurement error; dashed lines are those that account for measurement error. Left two panels: the mixture of Berkson errors model using parametric (top) and semiparametric (bottom) fits. Right two panels: the no Berkson error model using parametric (top) and semiparametric (bottom) fits. In the bottom two panels, the dashed lines are the SIMEX fits.

Fig 4. Estimated standard errors of the nonparametric fits in the Nevada Test Site thyroiditis data ignoring measurement error when the errors are a mixture of Berkson and classical errors. Solid line: the ordinary estimated pointwise standard errors when the variability of the parametric part is not taken into account. Dashed line: our new method for estimating standard errors as in equation (10). Dot-dashed line: bootstrap standard errors based on 1,000 bootstrap replicates.

Table 1
Results of the simulation for the model where X is measured with error.Here Quadratic and Cubic refer to SIMEX extrapolation via the quadratic and cubic functions.MSE Efficiency is mean squared error efficiency of the methods relative to the naive method that ignores measurement error