Göring H H, Terwilliger J D
Department of Genetics and Development, Columbia University, New York, NY, USA.
Am J Hum Genet. 2000 Apr;66(4):1298-309. doi: 10.1086/302846. Epub 2000 Mar 23.
In linkage and linkage disequilibrium (LD) analysis of complex multifactorial phenotypes, various types of errors can greatly reduce the chance of successful gene localization. The power of such studies, even in the absence of errors, is quite low, and, accordingly, their robustness to errors can be poor, especially in multipoint analysis. For this reason, it is important to deal with the ramifications of errors up front, as part of the analytical strategy. In this study, errors in the characterization of marker-locus parameters (including allele frequencies, haplotype frequencies [i.e., LD between marker loci], recombination fractions, and locus order) are dealt with through the use of profile likelihoods maximized over such nuisance parameters. It is shown that the common practice of assuming fixed, erroneous values for such parameters can reduce the power and/or increase the probability of obtaining false-positive results in a study. The effects of errors in assumed parameter values are generally more severe when a larger number of less informative marker loci, like the highly touted single-nucleotide polymorphisms (SNPs), are analyzed jointly than when fewer but more informative marker loci, such as microsatellites, are used. Rather than fixing inaccurate values for these parameters a priori, we propose to treat them as nuisance parameters through the use of profile likelihoods. It is demonstrated that the power of linkage and/or LD analysis can be increased through application of this technique in situations where parameter values cannot be specified with a high degree of certainty.
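The idea of profiling out a nuisance parameter can be illustrated with a deliberately simplified toy, which is not the pedigree-based likelihood used in the study. The sketch below assumes a single biallelic marker, allele counts in cases and in a reference sample, and a binomial model; the function names and all counts are hypothetical and chosen only for illustration. It contrasts a likelihood-ratio statistic computed with a fixed, a priori allele frequency against one in which the population allele frequency is maximized over under both the null and the alternative hypothesis.

```python
import math


def binom_loglik(count_a, total, freq):
    """Binomial log-likelihood of an allele frequency (constant term dropped)."""
    freq = min(max(freq, 1e-12), 1 - 1e-12)  # guard against log(0)
    return count_a * math.log(freq) + (total - count_a) * math.log(1 - freq)


def fixed_lr_stat(case_a, case_n, assumed_freq):
    """LR statistic when the population allele frequency is fixed a priori.

    Only the case sample is used; the null hypothesis is that the case
    frequency equals the assumed (possibly erroneous) population value.
    """
    alt = binom_loglik(case_a, case_n, case_a / case_n)
    null = binom_loglik(case_a, case_n, assumed_freq)
    return 2 * (alt - null)


def profile_lr_stat(case_a, case_n, ref_a, ref_n):
    """LR statistic with the population frequency treated as a nuisance parameter.

    The likelihood is maximized over the allele frequency separately under
    each hypothesis (closed-form maxima for this binomial toy model).
    """
    # Alternative: case and reference frequencies are free to differ.
    alt = (binom_loglik(case_a, case_n, case_a / case_n)
           + binom_loglik(ref_a, ref_n, ref_a / ref_n))
    # Null: one shared frequency, maximized at the pooled estimate.
    pooled = (case_a + ref_a) / (case_n + ref_n)
    null = (binom_loglik(case_a, case_n, pooled)
            + binom_loglik(ref_a, ref_n, pooled))
    return 2 * (alt - null)


# Hypothetical counts: no real association (0.30 in cases, 0.31 in the
# reference sample), but the a priori frequency is misspecified as 0.20.
fixed_stat = fixed_lr_stat(case_a=60, case_n=200, assumed_freq=0.20)
profile_stat = profile_lr_stat(case_a=60, case_n=200, ref_a=62, ref_n=200)
print(f"fixed-frequency statistic: {fixed_stat:.2f}")
print(f"profile-likelihood statistic: {profile_stat:.2f}")
```

With these made-up counts, the fixed-frequency statistic exceeds the usual 1-df chi-square threshold of 3.84 even though cases and reference sample barely differ, whereas the profiled statistic stays well below it. This mirrors, in miniature, the false-positive risk the abstract attributes to fixing erroneous parameter values; it says nothing about the magnitude of the effect in real pedigree data.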