
Detection of gene annotations and protein-protein interaction associated disorders through transitive relationships between integrated annotations.

Authors

Masseroli Marco, Canakoglu Arif, Quigliatti Massimiliano

Publication details

BMC Genomics. 2015;16(Suppl 6):S5. doi: 10.1186/1471-2164-16-S6-S5. Epub 2015 Jun 1.

DOI:10.1186/1471-2164-16-S6-S5
PMID:26046679
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4460591/
Abstract

BACKGROUND

Increasingly large amounts of heterogeneous and valuable controlled biomolecular annotations are available, but they are far from exhaustive and are scattered across many databases. Several annotation integration and prediction approaches have been proposed, but these issues remain unsolved. We previously created a Genomic and Proteomic Knowledge Base (GPKB) that efficiently integrates many distributed biomolecular annotation and interaction data of several organisms, including 32,956,102 gene annotations, 273,522,470 protein annotations and 277,095 protein-protein interactions (PPIs).

RESULTS

By comprehensively leveraging the transitive relationships defined by the numerous association data integrated in GPKB, we developed a software procedure that effectively detects and supplements consistent biomolecular annotations not present in the integrated sources. Following a set of defined logic rules, it does so only when the semantic type of the data and of their relationships, as well as the cardinality of the relationships, allow identifying annotations that are compliant with molecular biology. Thanks to the controlled consistency and quality enforced on data integrated in GPKB, and to the procedures used to avoid error propagation during their automatic processing, we could reliably identify many annotations, which we integrated in GPKB. They comprise 3,144 gene-to-pathway and 21,942 gene-to-biological-function annotations of many organisms, and 1,027 candidate associations between 317 genetic disorders and 782 human PPIs. The overall estimated recall and precision of our approach were 90.56% and 96.61%, respectively. Co-functional evaluation of genes with known function showed high functional similarity between genes with newly detected annotations and genes with known annotations to the same pathway; also considering the newly detected gene functional annotations enhanced this functional similarity, which resembled the similarity existing between genes known to be annotated to the same pathway. Strong evidence was also found in the literature for the candidate associations detected between the Cystic fibrosis disorder and the PPIs between the CFTR_HUMAN, DERL1_HUMAN, RNF5_HUMAN, AHSA1_HUMAN and GOPC_HUMAN proteins, and between the CHIP_HUMAN and HSP7C_HUMAN proteins.
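The idea of detecting annotations through transitive relationships can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `infer_transitive`, the `max_b_per_a` cardinality cap, and all identifiers in the toy data are hypothetical, and the actual logic rules used in GPKB are not described in this abstract. The sketch only assumes that a gene linked to a protein, and that protein linked to a pathway, may yield a candidate gene-to-pathway annotation when the pair is not already known.

```python
from collections import defaultdict


def infer_transitive(ab, bc, known_ac, max_b_per_a=None):
    """Return (a, c) pairs implied by (a, b) and (b, c) but absent from known_ac.

    max_b_per_a is a hypothetical stand-in for a cardinality rule: when set,
    any 'a' linked to more than that many 'b' items is skipped entirely.
    """
    b_by_a = defaultdict(set)
    for a, b in ab:
        b_by_a[a].add(b)

    c_by_b = defaultdict(set)
    for b, c in bc:
        c_by_b[b].add(c)

    inferred = set()
    for a, bs in b_by_a.items():
        # Skip sources whose intermediate links exceed the allowed cardinality.
        if max_b_per_a is not None and len(bs) > max_b_per_a:
            continue
        for b in bs:
            for c in c_by_b.get(b, ()):
                # Keep only annotations that are not already present.
                if (a, c) not in known_ac:
                    inferred.add((a, c))
    return inferred


# Toy data with made-up identifiers (not taken from GPKB).
gene_protein = {("geneA", "protA"), ("geneB", "protB1"), ("geneB", "protB2")}
protein_pathway = {("protA", "pw1"), ("protA", "pw2"), ("protB1", "pw3")}
known_gene_pathway = {("geneA", "pw1")}

strict = infer_transitive(gene_protein, protein_pathway,
                          known_gene_pathway, max_b_per_a=1)
relaxed = infer_transitive(gene_protein, protein_pathway, known_gene_pathway)
print(sorted(strict))
print(sorted(relaxed))
```

With the cap set to 1, `geneB` is skipped because it maps to two proteins, so only the `geneA` candidate remains; without the cap, both genes contribute candidates. A real rule set would also have to check the semantic types of the chained relationships, which this sketch leaves out.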

CONCLUSIONS

Although identified gene annotations and PPI-genetic disorder candidate associations require biological validation, our approach intrinsically provides their in silico evidence based on available data. Public availability within the GPKB (http://www.bioinformatics.deib.polimi.it/GPKB/) of all identified and integrated annotations offers a valuable resource fostering new biomedical-molecular knowledge discoveries.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e18b/4460591/35df1dd200ac/1471-2164-16-S6-S5-5.jpg
