
Systematic identification and analysis of frequent gene fusion events in metabolic pathways.

Authors

Henry Christopher S, Lerma-Ortiz Claudia, Gerdes Svetlana Y, Mullen Jeffrey D, Colasanti Ric, Zhukov Aleksey, Frelin Océane, Thiaville Jennifer J, Zallot Rémi, Niehaus Thomas D, Hasnain Ghulam, Conrad Neal, Hanson Andrew D, de Crécy-Lagard Valérie

Affiliations

Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 60439, USA.

Computation Institute, The University of Chicago, Chicago, IL, 60637, USA.

Publication information

BMC Genomics. 2016 Jun 24;17:473. doi: 10.1186/s12864-016-2782-3.

Abstract

BACKGROUND

Gene fusions are the most powerful type of in silico-derived functional associations. However, many fusion compilations were made when <100 genomes were available, and algorithms for identifying fusions need updating to handle the current avalanche of sequenced genomes. The availability of a large fusion dataset would help probe functional associations and enable systematic analysis of where and why fusion events occur.

RESULTS

Here we present a systematic analysis of fusions in prokaryotes. We manually generated two training sets: (i) 121 fusions in the model organism Escherichia coli; (ii) 131 fusions found in B vitamin metabolism. These sets were used to develop a fusion prediction algorithm that captured the training set fusions with only 7% false negatives and 50% false positives, a substantial improvement over existing approaches. This algorithm was then applied to identify 3.8 million potential fusions across 11,473 genomes. The results of the analysis are available in a searchable database at http://modelseed.org/projects/fusions/. A functional analysis identified 3,000 reactions associated with frequent fusion events and revealed areas of metabolism where fusions are particularly prevalent.
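To make the reported error rates concrete, the sketch below shows one common way such figures can be computed when a predictor is scored against a manually curated set. The abstract does not state the exact definitions the authors used, so the formulas here are assumptions: the false-negative rate is taken as the share of curated fusions the predictor misses, and the false-positive rate as the share of predictions absent from the curated set. The function name `fusion_error_rates` and all gene identifiers are hypothetical.

```python
def fusion_error_rates(curated, predicted):
    """Return (false_negative_rate, false_positive_rate) for two collections of gene IDs.

    Assumed definitions (not taken from the paper):
      false-negative rate = curated fusions not predicted / all curated fusions
      false-positive rate = predictions not in curated set / all predictions
    """
    curated = set(curated)
    predicted = set(predicted)
    if not curated or not predicted:
        raise ValueError("both collections must be non-empty")

    missed = curated - predicted      # curated fusions the predictor failed to find
    spurious = predicted - curated    # predictions with no curated support

    fn_rate = len(missed) / len(curated)
    fp_rate = len(spurious) / len(predicted)
    return fn_rate, fp_rate


# Hypothetical identifiers, for illustration only.
curated = {"geneA", "geneB", "geneC", "geneD"}
predicted = {"geneA", "geneB", "geneC", "geneX", "geneY", "geneZ"}

fn_rate, fp_rate = fusion_error_rates(curated, predicted)
print(f"false negatives: {fn_rate:.0%}, false positives: {fp_rate:.0%}")
```

Under these assumed definitions, a high false-positive rate can coexist with a low false-negative rate: the predictor recovers almost every curated fusion while also proposing many candidates that the curated set does not contain, some of which may be genuine fusions that were never curated.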

CONCLUSIONS

Customary definitions of fusions were shown to be ambiguous, and a stricter one was proposed. Exploring the genes participating in fusion events showed that they most commonly encode transporters, regulators, and metabolic enzymes. The major rationales for fusions between metabolic genes appear to be overcoming pathway bottlenecks, avoiding toxicity, controlling competing pathways, and facilitating expression and assembly of protein complexes. Finally, our fusion dataset provides powerful clues to decipher the biological activities of domains of unknown function.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a02/4921024/2471a7ef5abe/12864_2016_2782_Fig1_HTML.jpg
