Allele frequency misspecification: effect on power and Type I error of model-dependent linkage analysis of quantitative traits under random ascertainment.

Author information

Mandal Diptasri M, Sorant Alexa J M, Atwood Larry D, Wilson Alexander F, Bailey-Wilson Joan E

Affiliation

Department of Genetics, Louisiana State University Health Sciences Center, CSRB 6-16, New Orleans, LA 70112, USA.

Publication information

BMC Genet. 2006 Apr 20;7:21. doi: 10.1186/1471-2156-7-21.

DOI:10.1186/1471-2156-7-21

PMID:16618369

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1475629/

Abstract

BACKGROUND

Studies of model-based linkage analysis show that trait or marker model misspecification leads to decreasing power or increasing Type I error rate. An increase in Type I error rate is seen when marker related parameters (e.g., allele frequencies) are misspecified and ascertainment is through the trait, but lod-score methods are expected to be robust when ascertainment is random (as is often the case in linkage studies of quantitative traits). In previous studies, the power of lod-score linkage analysis using the "correct" generating model for the trait was found to increase when the marker allele frequencies were misspecified and parental data were missing. An investigation of Type I error rates, conducted in the absence of parental genotype data and with misspecification of marker allele frequencies, showed that an inflation in Type I error rate was the cause of at least part of this apparent increased power. To investigate whether the observed inflation in Type I error rate in model-based LOD score linkage was due to sampling variation, the trait model was estimated from each sample using REGCHUNT, an automated segregation analysis program used to fit models by maximum likelihood using many different sets of initial parameter estimates.

RESULTS

The Type I error rates observed using the trait models generated by REGCHUNT were usually closer to the nominal levels than those obtained when assuming the generating trait model.

CONCLUSION

This suggests that the observed inflation of Type I error upon misspecification of marker allele frequencies is at least partially due to sampling variation. Thus, with missing parental genotype data, lod-score linkage is not as robust to misspecification of marker allele frequencies as has been commonly thought.
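As a rough illustration of what an "observed" versus "nominal" Type I error rate means in a simulation study of this kind, the sketch below counts how often LOD scores from replicates simulated under no linkage reach a significance threshold, then checks whether that proportion falls outside a simple binomial band around the nominal level. It is not the authors' procedure: the function names, the LOD values, the threshold of 1.0, and the nominal level of 0.05 are all hypothetical placeholders, and matching a threshold to its nominal level is left to the analyst.

```python
import math


def empirical_type1_error(null_lods, threshold):
    """Fraction of null (no-linkage) replicates whose LOD score reaches the threshold."""
    if not null_lods:
        raise ValueError("need at least one replicate")
    hits = sum(1 for lod in null_lods if lod >= threshold)
    return hits / len(null_lods)


def nominal_band(nominal, n_replicates, z=1.96):
    """Approximate 95% band for the observed rate if the true rate equals `nominal`.

    Uses the normal approximation to the binomial, which is crude for small
    replicate counts; it is only meant to show the idea.
    """
    se = math.sqrt(nominal * (1.0 - nominal) / n_replicates)
    return max(0.0, nominal - z * se), min(1.0, nominal + z * se)


# Hypothetical LOD scores from ten replicates simulated with no linkage.
lods = [0.0, 0.1, 0.9, 0.0, 0.3, 1.2, 0.0, 0.05, 0.0, 0.6]

rate = empirical_type1_error(lods, threshold=1.0)  # placeholder threshold
low, high = nominal_band(0.05, len(lods))          # placeholder nominal level
inflated = rate > high  # True only if the excess exceeds the binomial band
```

An observed rate inside the band is compatible with the nominal level given the number of replicates; a rate above it points to inflation larger than this binomial approximation would allow.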
