Likelihood-based complex trait association testing for arbitrary depth sequencing data.

Author information

Yan Song, Yuan Shuai, Xu Zheng, Zhang Baqun, Zhang Bo, Kang Guolian, Byrnes Andrea, Li Yun

Affiliations

Department of Biostatistics, Department of Genetics, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA; Merck Research Laboratories, North Wales, PA, USA; School of Statistics, Renmin University of China, Beijing, People's Republic of China; Department of Statistics, North Carolina State University, Raleigh, NC 27607, USA; Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; and Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA.

Publication information

Bioinformatics. 2015 Sep 15;31(18):2955-62. doi: 10.1093/bioinformatics/btv307. Epub 2015 May 14.

Abstract

In next-generation sequencing (NGS)-based genetic studies, researchers typically perform genotype calling first and then apply standard genotype-based methods for association testing. However, such a two-step approach ignores genotype calling uncertainty in the association testing step and may incur power loss and/or inflated type I error. In the recent literature, a few robust and efficient likelihood-based methods, including both likelihood ratio tests (LRTs) and score tests, have been proposed to carry out association testing without intermediate genotype calling. These methods take genotype calling uncertainty into account by directly incorporating the genotype likelihood function (GLF) of NGS data into association analysis. However, existing LRT methods are computationally demanding or do not allow covariate adjustment, while existing score tests are not applicable to markers with low minor allele frequency (MAF). We provide an LRT allowing flexible covariate adjustment, develop a statistically more powerful score test and propose a combination strategy (UNC combo) to leverage the advantages of both tests. We have carried out extensive simulations to evaluate the performance of our proposed LRT and score test. Simulations and real data analysis demonstrate the advantages of our proposed combination strategy: it offers a satisfactory trade-off in terms of computational efficiency, applicability (accommodating both common variants and variants with low MAF) and statistical power, particularly for the analysis of quantitative traits, where the power gain can be up to ~60% when the causal variant is of low frequency (MAF < 0.01).
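To make the idea of testing association directly on genotype likelihoods concrete, the following is a minimal sketch of a generic GLF-based likelihood ratio test for a quantitative trait. It is not the UNC combo implementation and does not handle covariates. It assumes a Gaussian trait model, a Hardy-Weinberg genotype prior, and a simple binomial read model with a fixed error rate; the marginal likelihood sums over the three unobserved genotypes for each individual and is maximized by EM under the null (no genotype effect) and the full model. All function names (`genotype_likelihoods`, `em_fit`, `glf_lrt`, `simulate`) and parameter values are hypothetical and chosen for illustration only.

```python
import math
import random

ERR = 0.01  # assumed per-read sequencing error rate (illustrative value)


def genotype_likelihoods(alt_reads, depth, err=ERR):
    """P(reads | g) for g = 0, 1, 2 under a simple binomial read model."""
    alt_prob = [err, 0.5, 1.0 - err]  # chance that one read shows the alt allele
    return [q ** alt_reads * (1.0 - q) ** (depth - alt_reads) for q in alt_prob]


def normal_pdf(y, mean, var):
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_lik(y, gl, b0, b1, var, maf):
    """Marginal log-likelihood, summing over the three unobserved genotypes."""
    prior = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]  # Hardy-Weinberg
    return sum(
        math.log(sum(prior[g] * gli[g] * normal_pdf(yi, b0 + b1 * g, var) for g in range(3)))
        for yi, gli in zip(y, gl)
    )


def em_fit(y, gl, start, null, iters=100):
    """EM for (b0, b1, var, maf); with null=True the effect b1 stays at its start value."""
    n = len(y)
    b0, b1, var, maf = start
    for _ in range(iters):
        prior = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
        # E-step: posterior genotype probabilities given reads and trait.
        post = []
        for yi, gli in zip(y, gl):
            joint = [prior[g] * gli[g] * normal_pdf(yi, b0 + b1 * g, var) for g in range(3)]
            total = sum(joint)
            post.append([j / total for j in joint])
        # M-step: allele frequency, then weighted least squares, then variance.
        eg = [w[1] + 2 * w[2] for w in post]   # E[g_i]
        eg2 = [w[1] + 4 * w[2] for w in post]  # E[g_i^2]
        maf = min(max(sum(eg) / (2 * n), 1e-6), 1 - 1e-6)
        if not null:
            sg, sg2, sy = sum(eg), sum(eg2), sum(y)
            sgy = sum(e * yi for e, yi in zip(eg, y))
            denom = n * sg2 - sg ** 2
            if denom > 1e-12:
                b1 = (n * sgy - sg * sy) / denom
                b0 = (sy - b1 * sg) / n
        var = sum(w[g] * (yi - b0 - b1 * g) ** 2
                  for yi, w in zip(y, post) for g in range(3)) / n
    return b0, b1, var, maf


def glf_lrt(y, gl):
    """Return (LRT statistic, p-value, full-model estimates)."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    null = em_fit(y, gl, (mean, 0.0, var, 0.2), null=True)
    full = em_fit(y, gl, null, null=False)  # start the full fit from the null fit
    stat = max(0.0, 2.0 * (log_lik(y, gl, *full) - log_lik(y, gl, *null)))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 degree of freedom
    return stat, p_value, full


def simulate(n, maf, beta, mean_depth, rng):
    """Toy data: trait values plus read-based genotype likelihoods."""
    y, gl = [], []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)
        depth = sum(rng.random() < 0.5 for _ in range(2 * mean_depth))  # crude depth model
        alt_prob = [ERR, 0.5, 1.0 - ERR][g]
        alt = sum(rng.random() < alt_prob for _ in range(depth))
        y.append(beta * g + rng.gauss(0.0, 1.0))
        gl.append(genotype_likelihoods(alt, depth))
    return y, gl


if __name__ == "__main__":
    rng = random.Random(1)
    y, gl = simulate(n=300, maf=0.3, beta=1.0, mean_depth=4, rng=rng)
    stat, p_value, (b0, b1, var, maf) = glf_lrt(y, gl)
    print(f"LRT = {stat:.1f}, p = {p_value:.2e}, b1 = {b1:.2f}, maf = {maf:.2f}")
```

Because no genotype is ever called, an individual with zero reads simply contributes flat genotype likelihoods and is weighted by the prior alone, which is the sense in which such a test accommodates arbitrary sequencing depth. Starting the full-model EM from the null estimates keeps the statistic non-negative, since EM does not decrease the likelihood.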

AVAILABILITY AND IMPLEMENTATION

UNC combo and the associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/UNCcombo/

CONTACT

yunli@med.unc.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
