Factors affecting the effective number of tests in genetic association studies: a comparative study of three PCA-based methods.

Affiliation

Department of Public Health, College of Medicine, Tzu-Chi University, Hualien, Taiwan.

Publication information

J Hum Genet. 2011 Jun;56(6):428-35. doi: 10.1038/jhg.2011.34. Epub 2011 Mar 31.

Abstract

The number of markers tested in genetic association studies (GAS) has grown large, and one major challenge is deriving the multiple-testing threshold. Several approaches that calculate an effective number of tests (M(eff)) in GAS have been developed and shown to be promising. To date, however, their robustness to influencing factors has not been compared. We evaluated the performance of three principal component analysis (PCA)-based M(eff) estimation formulas: M(eff-C) from Cheverud (2001), M(eff-L) from Li and Ji (2005), and M(eff-G) from Galwey (2009). Four influencing factors were considered: the LD measure, marker density, population sample, and total number of tested markers. We validated the formulas against Bonferroni's method and a permutation test with 10 000 random shuffles, using three real data sets. For each factor, M(eff-C) yielded a conservative threshold except with the D' coefficient, and M(eff-G) was too liberal compared with the permutation test. Our results indicate that M(eff-L) based on the r² coefficient closely approximates the permutation threshold. For a large number of markers, we recommend using M(eff-L) with the r² coefficient, under either fixed-length or fixed-number separation of the markers, to obtain an accurate estimate of the multiple-testing threshold and to save computational time.
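The abstract names the three estimators without stating their formulas. As a rough illustration, the sketch below implements them as they are commonly quoted in the literature; the formulas themselves are an assumption here and should be checked against the cited papers. All three work on the eigenvalues of a marker-by-marker correlation (LD) matrix. The function names and the equicorrelation toy input are hypothetical and are not taken from the study.

```python
import math


def meff_cheverud(eigenvalues):
    """M(eff-C), as commonly quoted: 1 + (M - 1) * (1 - Var(lambda) / M).

    Var(lambda) is the sample variance of the eigenvalues (denominator M - 1).
    """
    m = len(eigenvalues)
    if m < 2:
        return float(m)
    mean = sum(eigenvalues) / m
    var = sum((lam - mean) ** 2 for lam in eigenvalues) / (m - 1)
    return 1.0 + (m - 1) * (1.0 - var / m)


def meff_li_ji(eigenvalues):
    """M(eff-L), as commonly quoted: sum of I(x >= 1) + (x - floor(x)), x = |lambda|."""
    total = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        total += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return total


def meff_galwey(eigenvalues):
    """M(eff-G), as commonly quoted: (sum sqrt(lambda))^2 / sum(lambda).

    Negative eigenvalues are set to zero before the calculation.
    """
    positive = [max(lam, 0.0) for lam in eigenvalues]
    total = sum(positive)
    if total == 0.0:
        return 0.0
    return sum(math.sqrt(lam) for lam in positive) ** 2 / total


def bonferroni_threshold(alpha, meff):
    """Bonferroni-style per-test threshold using M(eff) in place of the raw test count."""
    return alpha / meff


def equicorrelation_eigenvalues(m, rho):
    """Eigenvalues of an m x m correlation matrix whose off-diagonal entries all equal rho.

    Toy input only: one eigenvalue is 1 + (m - 1) * rho, the other m - 1 are 1 - rho.
    For real data the eigenvalues would come from the LD matrix itself,
    for example via numpy.linalg.eigvalsh.
    """
    return [1.0 + (m - 1) * rho] + [1.0 - rho] * (m - 1)


if __name__ == "__main__":
    eig = equicorrelation_eigenvalues(4, 0.5)
    for name, fn in [("M(eff-C)", meff_cheverud),
                     ("M(eff-L)", meff_li_ji),
                     ("M(eff-G)", meff_galwey)]:
        meff = fn(eig)
        print(name, round(meff, 3), "threshold:", bonferroni_threshold(0.05, meff))
```

In the two limiting cases all three estimators agree: four uncorrelated markers give an effective number of 4, and four perfectly correlated markers give 1. A toy matrix like this is not expected to reproduce the relative behaviour the study reports on real data.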

