Unbiased functional clustering of gene variants with a phenotypic-linkage network.

Author information

Honti Frantisek, Meader Stephen, Webber Caleb

Affiliations

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom.

Publication information

PLoS Comput Biol. 2014 Aug 28;10(8):e1003815. doi: 10.1371/journal.pcbi.1003815. eCollection 2014 Aug.

Abstract

Groupwise functional analysis of gene variants is becoming standard in next-generation sequencing studies. As the function of many genes is unknown and their classification to pathways is scant, functional associations between genes are often inferred from large-scale omics data. Such data types, including protein-protein interactions and gene co-expression networks, are used to examine the interrelations of the implicated genes. Statistical significance is assessed by comparing the interconnectedness of the mutated genes with that of random gene sets. However, interconnectedness can be affected by confounding bias, potentially resulting in false positive findings. We show that genes implicated through de novo sequence variants are biased in their coding-sequence length and that longer genes tend to cluster together, which leads to overstated statistical significance in functional studies; we present here an integrative method that addresses this bias. To discern molecular pathways relevant to complex disease, we have inferred functional associations between human genes from diverse data types and assessed them with a novel phenotype-based method. Examining the functional association between de novo gene variants, we control for the heretofore unexplored confounding bias in coding-sequence length. We test different data types and networks and find that the disease-associated genes cluster more significantly in an integrated phenotypic-linkage network than in other gene networks. We present a tool of superior power to identify functional associations among genes mutated in the same disease, even after accounting for significant sequencing study bias, and demonstrate the suitability of this method to functionally cluster variant genes underlying polygenic disorders.
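The abstract's central statistical idea, comparing the interconnectedness of the mutated genes with that of random gene sets while controlling for coding-sequence length, can be illustrated with a length-matched permutation test. The following is a minimal sketch and not the authors' implementation: the function names `connectivity` and `length_matched_pvalue`, the `weights` and `cds_length` inputs, the equal-sized length bins, and the sum-of-edge-weights score are all assumptions made for illustration.

```python
import random


def connectivity(genes, weights):
    """Sum of edge weights over all unordered pairs within `genes`.

    `weights` maps frozenset({gene_a, gene_b}) -> link weight (hypothetical format).
    Pairs without an entry contribute 0.
    """
    genes = list(genes)
    total = 0.0
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            total += weights.get(frozenset((genes[i], genes[j])), 0.0)
    return total


def length_matched_pvalue(case_genes, cds_length, weights,
                          n_bins=10, n_perm=1000, seed=0):
    """Empirical p-value for the connectivity of `case_genes`.

    Each random gene set replaces every case gene with a gene drawn from
    the same coding-sequence-length bin, so the null sets share the
    length profile of the observed set. Every case gene must appear in
    `cds_length`.
    """
    rng = random.Random(seed)

    # Rank all genes by CDS length and cut the ranking into equal-sized bins.
    ranked = sorted(cds_length, key=cds_length.get)
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    bins = {}
    for gene, b in bin_of.items():
        bins.setdefault(b, []).append(gene)

    observed = connectivity(case_genes, weights)
    hits = 0
    for _ in range(n_perm):
        chosen = set()
        for gene in case_genes:
            # Sample without replacement from the case gene's own length bin.
            pool = [g for g in bins[bin_of[gene]] if g not in chosen]
            chosen.add(rng.choice(pool))
        if connectivity(chosen, weights) >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Setting `n_bins=1` removes the length matching and reduces the procedure to an ordinary random-gene-set comparison, which makes it easy to see how much of an apparent clustering signal is attributable to gene length alone.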

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d299/4148192/29e8d743043d/pcbi.1003815.g001.jpg
