Methods for collapsing multiple rare variants in whole-genome sequence data.

Author information

Sung Yun Ju, Korthauer Keegan D, Swartz Michael D, Engelman Corinne D

Affiliation

Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri, United States of America.

Publication information

Genet Epidemiol. 2014 Sep;38 Suppl 1(0 1):S13-20. doi: 10.1002/gepi.21820.

Abstract

Genetic Analysis Workshop 18 provided whole-genome sequence data in a pedigree-based sample and longitudinal phenotype data for hypertension and related traits, presenting an excellent opportunity for evaluating analysis choices. We summarize the nine contributions to the working group on collapsing methods, which evaluated various approaches for the analysis of multiple rare variants. One contributor defined a variant prioritization scheme, whereas the remaining eight contributors evaluated statistical methods for association analysis. Six contributors chose the gene as the genomic region for collapsing variants, whereas three contributors chose nonoverlapping sliding windows across the entire genome. Statistical methods spanned most of the published methods, including well-established burden tests, variance-components-type tests, and recently developed hybrid approaches. Lesser-known methods, such as functional principal components analysis, higher criticism, and homozygosity association, and some newly introduced methods were also used. We found that the performance of these methods depended on the characteristics of the genomic region, such as the effect size and direction of the variants under consideration. Except for MAP4 and FLT3, the performance of all statistical methods in identifying rare causal variants was disappointingly poor, with overall power almost identical to the type I error rate. This poor performance may have arisen from a combination of (1) small sample size, (2) small effects of most of the causal variants, explaining a small fraction of variance, (3) use of incomplete annotation information, and (4) linkage disequilibrium between causal variants in a gene and noncausal variants in nearby genes. Our findings demonstrate challenges in analyzing rare variants identified from sequence data.
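
As an illustration of what "collapsing" means for the burden-type tests mentioned above, the following minimal Python sketch sums rare minor alleles within one region (for example, a gene) into a single per-individual score. It is not taken from any of the nine contributions. The function names, the toy genotype matrix, and the frequency threshold are all hypothetical, and the sketch treats individuals as unrelated, which the pedigree-based workshop data are not.

```python
from typing import List, Sequence


def allele_frequencies(genotypes: Sequence[Sequence[int]]) -> List[float]:
    """Per-variant allele frequency from 0/1/2 minor-allele counts.

    Rows are individuals, columns are variants in one region.
    Assumes unrelated individuals (an illustrative simplification).
    """
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_individuals)
        for j in range(n_variants)
    ]


def rare_variant_indices(genotypes, maf_threshold: float) -> List[int]:
    """Indices of variants that are observed but rarer than the threshold."""
    freqs = allele_frequencies(genotypes)
    return [j for j, f in enumerate(freqs) if 0 < f < maf_threshold]


def burden_scores(genotypes, maf_threshold: float) -> List[int]:
    """Collapse rare variants into one count of rare alleles per individual."""
    rare = rare_variant_indices(genotypes, maf_threshold)
    return [sum(row[j] for j in rare) for row in genotypes]


def carrier_indicators(genotypes, maf_threshold: float) -> List[int]:
    """Collapse to 1 if the individual carries any rare allele, else 0."""
    return [int(score > 0) for score in burden_scores(genotypes, maf_threshold)]


# Hypothetical toy region: 5 individuals x 4 variants.
# The threshold of 0.25 is chosen only so the toy data contain rare variants;
# real analyses use much smaller thresholds and far larger samples.
toy_genotypes = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]

scores = burden_scores(toy_genotypes, maf_threshold=0.25)
carriers = carrier_indicators(toy_genotypes, maf_threshold=0.25)
print(scores, carriers)
```

The resulting score would then be tested for association with the phenotype in place of the individual variants. Because the score adds alleles regardless of effect direction, a region containing both risk-increasing and risk-decreasing variants can cancel out, which is one reason performance depends on the direction of the variants in a region.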
