Evaluating methods for the analysis of rare variants in sequence data.

Authors

Luedtke Alexander, Powers Scott, Petersen Ashley, Sitarik Alexandra, Bekmetjev Airat, Tintle Nathan L

Affiliations

Division of Applied Mathematics, Brown University, 182 George Street, Providence, RI 02912, USA.

Department of Statistics and Operations Research, 318 Hanes Hall, CB 3260, University of North Carolina, Chapel Hill, NC 27599-3260, USA.

Publication information

BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S119. doi: 10.1186/1753-6561-5-S9-S119.

Abstract

A number of rare variant statistical methods have been proposed for analysis of the impending wave of next-generation sequencing data. To date, there are few direct comparisons of these methods on real sequence data. Furthermore, there is a strong need for practical advice on the proper analytic strategies for rare variant analysis. We compare four recently proposed rare variant methods (combined multivariate and collapsing, weighted sum, proportion regression, and cumulative minor allele test) on simulated phenotype and next-generation sequencing data as part of Genetic Analysis Workshop 17. Overall, we find that all analyzed methods have serious practical limitations in identifying causal genes. Specifically, no method has more than a 5% true discovery rate (percentage of truly causal genes among all those identified as significantly associated with the phenotype). Further exploration shows that all methods suffer from inflated false-positive error rates (chance that a noncausal gene will be identified as associated with the phenotype) because of population stratification and gametic phase disequilibrium between noncausal SNPs and causal SNPs. Furthermore, the observed true-positive rates (chance that a truly causal gene will be identified as significantly associated with the phenotype) for each of the four methods were very low (<19%). The combination of larger than anticipated false-positive rates, low true-positive rates, and only about 1% of all genes being causal yields poor discriminatory ability for all four methods. Gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.
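The abstract's closing argument, that inflated false-positive rates, low true-positive rates, and a roughly 1% share of causal genes combine into a poor true discovery rate, can be illustrated with a short calculation. The sketch below is not taken from the paper: the function name `true_discovery_rate` is hypothetical, the 19% true-positive rate is only the upper bound quoted in the abstract, and the false-positive rates are assumed values chosen for illustration, since the abstract does not report them.

```python
def true_discovery_rate(causal_fraction, tpr, fpr):
    """Expected share of flagged genes that are truly causal.

    causal_fraction: proportion of all genes that are causal
    tpr: chance that a causal gene is flagged as significant
    fpr: chance that a noncausal gene is flagged as significant
    """
    true_hits = causal_fraction * tpr
    false_hits = (1.0 - causal_fraction) * fpr
    total_hits = true_hits + false_hits
    if total_hits == 0:
        return float("nan")  # nothing flagged, so the rate is undefined
    return true_hits / total_hits


if __name__ == "__main__":
    causal_fraction = 0.01  # "about 1% of all genes being causal" (abstract)
    tpr = 0.19              # upper bound on the observed true-positive rate (abstract)
    # Assumed false-positive rates; illustrative only, not reported in the abstract.
    for fpr in (0.05, 0.10, 0.20):
        tdr = true_discovery_rate(causal_fraction, tpr, fpr)
        print(f"FPR = {fpr:.2f} -> TDR = {tdr:.3f}")
```

Under these assumptions, even a false-positive rate held at a nominal 5% would leave the true discovery rate below 4%, and any inflation beyond that lowers it further, which is consistent with the abstract's report that no method exceeded 5%.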
