Detection of gene annotations and protein-protein interaction associated disorders through transitive relationships between integrated annotations.

Author information

Masseroli Marco, Canakoglu Arif, Quigliatti Massimiliano

Publication information

BMC Genomics. 2015;16(Suppl 6):S5. doi: 10.1186/1471-2164-16-S6-S5. Epub 2015 Jun 1.

Abstract

BACKGROUND

Increasingly large amounts of heterogeneous and valuable controlled biomolecular annotations are available, but they are far from exhaustive and are scattered across many databases. Several annotation integration and prediction approaches have been proposed, but these issues remain unsolved. We previously created a Genomic and Proteomic Knowledge Base (GPKB) that efficiently integrates many distributed biomolecular annotation and interaction data of several organisms, including 32,956,102 gene annotations, 273,522,470 protein annotations and 277,095 protein-protein interactions (PPIs).

RESULTS

By comprehensively leveraging the transitive relationships defined by the numerous association data integrated in GPKB, we developed a software procedure that effectively detects and supplements consistent biomolecular annotations not present in the integrated sources. Following a set of defined logic rules, it does so only when the semantic type of the data and of their relationships, as well as the cardinality of the relationships, allow annotations consistent with molecular biology to be identified. Thanks to the controlled consistency and quality enforced on the data integrated in GPKB, and to the procedures used to avoid error propagation during their automatic processing, we could reliably identify many annotations, which we integrated in GPKB. They comprise 3,144 gene-to-pathway and 21,942 gene-to-biological-function annotations of many organisms, and 1,027 candidate associations between 317 genetic disorders and 782 human PPIs. The overall estimated recall and precision of our approach were 90.56% and 96.61%, respectively. Co-functional evaluation of genes with known function showed high functional similarity between genes with newly detected annotations and genes with known annotations to the same pathway; also considering the newly detected gene functional annotations enhanced this functional similarity, which then resembled the similarity existing between genes known to be annotated to the same pathway. Strong evidence was also found in the literature for the candidate associations detected between the Cystic fibrosis disorder and the PPIs between the CFTR_HUMAN, DERL1_HUMAN, RNF5_HUMAN, AHSA1_HUMAN and GOPC_HUMAN proteins, and between the CHIP_HUMAN and HSP7C_HUMAN proteins.
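The abstract does not spell out the logic rules, so the following is only a minimal sketch of the general idea of transitive annotation detection, not the authors' procedure. It assumes a gene-to-protein relation and a protein-to-pathway relation, and uses a hypothetical cardinality rule (propagate only when a protein maps to exactly one gene) to stand in for the checks described above. The function name `infer_transitive` and all identifiers in the sample data are illustrative placeholders.

```python
from collections import defaultdict


def infer_transitive(gene_to_protein, protein_to_pathway, known_gene_to_pathway):
    """Return (gene, pathway) pairs implied by gene -> protein -> pathway
    chains that are not already among the known annotations."""
    # Group genes by the protein they encode, so cardinality can be checked.
    genes_per_protein = defaultdict(set)
    for gene, protein in gene_to_protein:
        genes_per_protein[protein].add(gene)

    # Group pathways by protein.
    pathways_per_protein = defaultdict(set)
    for protein, pathway in protein_to_pathway:
        pathways_per_protein[protein].add(pathway)

    inferred = set()
    for protein, genes in genes_per_protein.items():
        # Hypothetical cardinality rule (an assumption, not taken from the
        # paper): skip proteins linked to more than one gene, because the
        # chain would not identify a single gene unambiguously.
        if len(genes) != 1:
            continue
        (gene,) = genes
        for pathway in pathways_per_protein.get(protein, ()):
            if (gene, pathway) not in known_gene_to_pathway:
                inferred.add((gene, pathway))
    return inferred


# Illustrative placeholder data, not taken from GPKB.
gene_to_protein = [
    ("GENE_A", "PROT_A"),
    ("GENE_B", "PROT_SHARED"),
    ("GENE_C", "PROT_SHARED"),
]
protein_to_pathway = [
    ("PROT_A", "PATHWAY_1"),
    ("PROT_A", "PATHWAY_2"),
    ("PROT_SHARED", "PATHWAY_1"),
]
known_gene_to_pathway = {("GENE_A", "PATHWAY_1")}

new_annotations = infer_transitive(
    gene_to_protein, protein_to_pathway, known_gene_to_pathway
)
print(sorted(new_annotations))
```

In this toy data only the GENE_A to PATHWAY_2 pair is reported: the GENE_A to PATHWAY_1 pair is already known, and PROT_SHARED is skipped because it is linked to two genes. The same pattern could in principle be applied to other chains, such as disorder to gene to protein to PPI, under whatever semantic and cardinality constraints the real rules impose.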

CONCLUSIONS

Although the identified gene annotations and the candidate associations between PPIs and genetic disorders require biological validation, our approach intrinsically provides in silico evidence for them based on available data. The public availability within the GPKB (http://www.bioinformatics.deib.polimi.it/GPKB/) of all identified and integrated annotations offers a valuable resource that fosters new biomedical-molecular knowledge discoveries.
