Abstract
In genetic studies of complex diseases, the underlying mode of inheritance is often not known. Thus, the most powerful test or other optimal procedure for one model, e.g., recessive, may be quite inefficient if another model, e.g., dominant, describes the inheritance process. Rather than choose among the procedures that are optimal for a particular model, it is preferable to seek a method that has high efficiency across a family of scientifically realistic models. Statisticians will recognize that this situation is analogous to the selection of an estimator of location when the form of the underlying distribution is not known. We review how the concepts and techniques in the efficiency robustness literature that are used to obtain efficiency robust estimators and rank tests can be adapted for the analysis of genetic data. In particular, several statistics have been used to test for a genetic association between a disease and a candidate allele or marker allele from data collected in case-control studies. Each of them is optimal for a specific inheritance model, and we describe and compare several robust methods. The most suitable robust test depends somewhat on the range of plausible genetic models. When little is known about the inheritance process, the maximum of the optimal statistics for the extreme models and an intermediate one is usually the preferred choice. Sometimes one can eliminate a mode of inheritance, e.g., from prior studies of family pedigrees one may know whether or not the disease skips generations. If it does, the disease is much more likely to follow a recessive model than a dominant one. In that case, a simpler linear combination of the optimal tests for the extreme models can be a robust choice.
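To make the two robust statistics mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the model-specific optimal statistics are Cochran-Armitage trend tests with genotype scores (0, 0, 1) for the recessive model, (0, 1/2, 1) for the additive (intermediate) model and (0, 1, 1) for the dominant model, and that the null correlation between two trend statistics is estimated from the pooled genotype counts. The function names and the genotype counts in the usage lines are hypothetical and serve only as an illustration.

```python
import math

# Assumed genotype scores (copies of the candidate allele: 0, 1, 2).
RECESSIVE = (0.0, 0.0, 1.0)
ADDITIVE = (0.0, 0.5, 1.0)
DOMINANT = (0.0, 1.0, 1.0)


def _pooled_cov(totals, x, y):
    """N * sum(x*y*n) - sum(x*n) * sum(y*n), computed on pooled genotype counts."""
    n_all = sum(totals)
    sxy = sum(a * b * n for a, b, n in zip(x, y, totals))
    sx = sum(a * n for a, n in zip(x, totals))
    sy = sum(b * n for b, n in zip(y, totals))
    return n_all * sxy - sx * sy


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for one set of genotype scores."""
    r_tot, s_tot = sum(cases), sum(controls)
    n_all = r_tot + s_tot
    totals = [r + s for r, s in zip(cases, controls)]
    u = sum(x * (s_tot * r - r_tot * s)
            for x, r, s in zip(scores, cases, controls))
    var_u = r_tot * s_tot * _pooled_cov(totals, scores, scores) / n_all
    return u / math.sqrt(var_u)


def null_corr(cases, controls, x, y):
    """Estimated null correlation between the trend statistics for scores x and y."""
    totals = [r + s for r, s in zip(cases, controls)]
    return _pooled_cov(totals, x, y) / math.sqrt(
        _pooled_cov(totals, x, x) * _pooled_cov(totals, y, y))


def max_stat(cases, controls):
    """Maximum absolute trend statistic over the three assumed models."""
    return max(abs(trend_z(cases, controls, s))
               for s in (RECESSIVE, ADDITIVE, DOMINANT))


def linear_combination(cases, controls):
    """Standardized sum of the recessive and dominant trend statistics."""
    rho = null_corr(cases, controls, RECESSIVE, DOMINANT)
    z_sum = (trend_z(cases, controls, RECESSIVE)
             + trend_z(cases, controls, DOMINANT))
    return z_sum / math.sqrt(2.0 * (1.0 + rho))


# Hypothetical genotype counts (0, 1, 2 copies), for illustration only.
example_cases = (10, 40, 50)
example_controls = (30, 50, 20)
example_max = max_stat(example_cases, example_controls)
example_lin = linear_combination(example_cases, example_controls)
```

Under the null hypothesis the standardized linear combination is approximately standard normal, whereas the maximum statistic is not; its p-value would have to be obtained separately, for example by simulation or numerical integration, which this sketch does not attempt.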
Information
Digital Object Identifier: 10.1214/074921706000000491