Computational discovery of direct associations between GO terms and protein domains.

Affiliation

Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54500, France.

Publication information

BMC Bioinformatics. 2018 Nov 20;19(Suppl 14):413. doi: 10.1186/s12859-018-2380-2.

DOI:10.1186/s12859-018-2380-2
PMID:30453875
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6245584/
Abstract

BACKGROUND

Families of related proteins and their different functions may be described systematically using common classifications and ontologies such as Pfam and GO (Gene Ontology). However, many proteins consist of multiple domains, and each domain, or some combination of domains, can be responsible for a particular molecular function. Therefore, identifying which domains should be associated with a specific function is a non-trivial task.

RESULTS

We describe a general approach for the computational discovery of associations between different sets of annotations by formalising the problem as a bipartite graph enrichment problem in the setting of a tripartite graph. We call this approach "CODAC" (for COmputational Discovery of Direct Associations using Common Neighbours). As one application of this approach, we describe "GODomainMiner" for associating GO terms with protein domains. We used GODomainMiner to predict GO-domain associations between each of the 3 GO ontology namespaces (MF, BP, and CC) and the Pfam, CATH, and SCOP domain classifications. Overall, GODomainMiner yields average 15-, 41-, and 25-fold enrichments in GO-domain associations relative to the existing GO annotations for Pfam, CATH, and SCOP, respectively.
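
The common-neighbour idea can be illustrated with a minimal Python sketch. It is not the authors' CODAC implementation: the function names, the toy identifiers (`GO:A`, `D1`, `P1`, and so on), and the choice of a hypergeometric tail probability as the score are all assumptions made for illustration. Proteins form the middle layer of a tripartite graph (GO term, protein, domain), and each GO-domain pair is scored by the proteins the two share.

```python
from collections import defaultdict
from itertools import product
from math import comb


def hypergeom_tail(k, n_total, n_term, n_domain):
    """P(X >= k) when n_domain proteins are drawn from n_total,
    of which n_term carry the GO term (illustrative score only)."""
    upper = min(n_term, n_domain)
    total = comb(n_total, n_domain)
    hits = sum(
        comb(n_term, i) * comb(n_total - n_term, n_domain - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def common_neighbour_scores(protein_go, protein_domains):
    """Map each (GO term, domain) pair sharing at least one protein
    to (number of shared proteins, hypergeometric tail probability)."""
    go_to_proteins = defaultdict(set)
    domain_to_proteins = defaultdict(set)
    for protein, terms in protein_go.items():
        for term in terms:
            go_to_proteins[term].add(protein)
    for protein, domains in protein_domains.items():
        for domain in domains:
            domain_to_proteins[domain].add(protein)

    n_total = len(set(protein_go) | set(protein_domains))
    scores = {}
    for term, domain in product(go_to_proteins, domain_to_proteins):
        with_term = go_to_proteins[term]
        with_domain = domain_to_proteins[domain]
        shared = len(with_term & with_domain)  # common neighbours
        if shared == 0:
            continue
        p = hypergeom_tail(shared, n_total, len(with_term), len(with_domain))
        scores[(term, domain)] = (shared, p)
    return scores


# Hypothetical toy data: placeholder identifiers, not real GO terms or domains.
protein_go = {"P1": {"GO:A"}, "P2": {"GO:A"}, "P3": {"GO:B"}, "P4": {"GO:B"}}
protein_domains = {"P1": {"D1", "D2"}, "P2": {"D1"}, "P3": {"D2"}, "P4": {"D3"}}

scores = common_neighbour_scores(protein_go, protein_domains)
for pair, (shared, p) in sorted(scores.items(), key=lambda item: item[1][1]):
    print(pair, shared, round(p, 3))
```

In this toy example the pair (`GO:A`, `D1`) ranks first because both proteins annotated with `GO:A` carry `D1`, whereas `D2` is split between the two terms. A real pipeline would also need multiple-testing correction and handling of the GO hierarchy, neither of which is covered here.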

CONCLUSIONS

These associations could potentially be used to annotate many of the protein chains in the Protein Data Bank and protein sequences in UniProt whose domain composition is known but which currently lack GO annotation.
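
As a rough sketch of how such associations might be applied, the Python snippet below proposes GO terms for proteins that have known domains but no existing annotation. The function name, the data layout, and the toy identifiers are hypothetical, and the rule used (take the union of terms linked to each single domain) is a simplifying assumption; it ignores the domain combinations mentioned in the background.

```python
def propose_go_terms(domain_to_go, protein_domains, annotated):
    """Suggest GO terms for unannotated proteins from their domains.

    domain_to_go: dict mapping a domain ID to a set of associated GO terms
    protein_domains: dict mapping a protein ID to a set of domain IDs
    annotated: set of protein IDs that already have GO annotation
    """
    proposals = {}
    for protein, domains in protein_domains.items():
        if protein in annotated:
            continue  # only fill in proteins that lack annotation
        terms = set()
        for domain in domains:
            terms |= domain_to_go.get(domain, set())
        if terms:
            proposals[protein] = terms
    return proposals


# Hypothetical toy data: placeholder identifiers only.
domain_to_go = {"D1": {"GO:A"}, "D2": {"GO:B"}}
protein_domains = {"P1": {"D1"}, "P5": {"D1", "D2"}, "P6": {"D9"}}
annotated = {"P1"}

proposals = propose_go_terms(domain_to_go, protein_domains, annotated)
print(proposals)
```

Here `P1` is skipped because it is already annotated, and `P6` receives no proposal because its only domain has no known association.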

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/78c40dd6391e/12859_2018_2380_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/21f00893f6e2/12859_2018_2380_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/75fe823f175a/12859_2018_2380_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/74430c42f3c6/12859_2018_2380_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/fe3bb786c7b0/12859_2018_2380_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476c/6245584/40ffcb231710/12859_2018_2380_Fig6_HTML.jpg
