An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology.

Affiliation

Department of Computer Science, University of Toronto, 10 Kings College Road, Toronto, Ontario M5S3G4, Canada.

Publication information

BMC Bioinformatics. 2010 Nov 15;11:562. doi: 10.1186/1471-2105-11-562.

DOI:10.1186/1471-2105-11-562
PMID:21078182
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2998529/
Abstract

BACKGROUND

Semantic similarity measures are useful for assessing the physiological relevance of protein-protein interactions (PPIs). They quantify the similarity between proteins based on their function, using annotation systems such as the Gene Ontology (GO). Proteins that interact in the cell are more likely to be in similar locations or involved in similar biological processes than proteins that do not interact. Thus, the more semantically similar the gene function annotations of the interacting proteins are, the more likely the interaction is to be physiologically relevant. However, most semantic similarity measures used for PPI confidence assessment do not consider the unequal depth of term hierarchies in different classes of the cellular location, molecular function, and biological process ontologies of GO, and thus may over- or underestimate similarity.
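
To make the notion of GO-based semantic similarity concrete, the following is a minimal sketch of the Resnik measure, which scores two terms by the information content (IC) of their most informative common ancestor, with IC(t) = -log p(t). The toy ontology, the term names, and the annotation counts are hypothetical and chosen only for illustration; they are not taken from the paper or from GO.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms (a small DAG).
PARENTS = {
    "root": [],
    "membrane": ["root"],
    "nucleus": ["root"],
    "nuclear_pore": ["nucleus", "membrane"],
    "nucleolus": ["nucleus"],
}

# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0, "membrane": 4, "nucleus": 2, "nuclear_pore": 1, "nucleolus": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t); p(t) counts annotations to t or any of its descendants."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    return {term: -math.log(c / total) for term, c in cumulative.items()}


IC = information_content()


def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(IC[t] for t in common)


def protein_similarity(terms_a, terms_b):
    """Best-match score over all term pairs (one of several possible choices)."""
    return max(resnik(a, b) for a in terms_a for b in terms_b)
```

In this toy setting, `resnik("nuclear_pore", "nucleolus")` is the IC of `nucleus`, while `resnik("membrane", "nucleolus")` is zero because the only shared ancestor is the root. Taking the maximum over term pairs in `protein_similarity` is an assumption of this sketch; averaging schemes are also in common use.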

RESULTS

We describe an improved algorithm, Topological Clustering Semantic Similarity (TCSS), to compute semantic similarity between GO terms annotated to proteins in interaction datasets. Our algorithm considers the unequal depth of biological knowledge representation in different branches of the GO graph. The central idea is to divide the GO graph into sub-graphs and to score a PPI higher if the participating proteins belong to the same sub-graph than if they belong to different sub-graphs.
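
The central idea can be illustrated with a toy sketch. This is not the TCSS algorithm itself, whose details the abstract does not give: the hierarchy, the hand-picked sub-graph roots, and the depth-based score normalized within each sub-graph are all assumptions made here purely for illustration.

```python
# Hypothetical toy hierarchy: term -> parent (a tree for simplicity; GO is a DAG).
PARENT = {
    "root": None,
    "A": "root", "a1": "A", "a2": "A", "a11": "a1", "a12": "a1",  # deep branch
    "B": "root", "b1": "B", "b2": "B",                            # shallow branch
}

# Assumption: sub-graph roots are chosen by hand for this sketch.
SUBGRAPH_ROOTS = {"A", "B"}


def path_to_top(term):
    """Return the list [term, parent, ..., root]."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def subgraph_of(term):
    """Return the sub-graph root at or above the term, or None if there is none."""
    for node in path_to_top(term):
        if node in SUBGRAPH_ROOTS:
            return node
    return None


def depth_in_subgraph(term):
    """Number of edges between the term and its sub-graph root."""
    return path_to_top(term).index(subgraph_of(term))


# Maximum depth of each sub-graph, used to normalize scores per sub-graph.
MAX_DEPTH = {}
for _term in PARENT:
    _group = subgraph_of(_term)
    if _group is not None:
        MAX_DEPTH[_group] = max(MAX_DEPTH.get(_group, 0), depth_in_subgraph(_term))


def toy_similarity(term_a, term_b):
    """Score in [0, 1]: positive only when both terms share a sub-graph."""
    group_a, group_b = subgraph_of(term_a), subgraph_of(term_b)
    if group_a is None or group_a != group_b:
        return 0.0  # different sub-graphs: lowest score in this toy scheme
    others = set(path_to_top(term_b))
    # First node on term_a's upward path that is also above term_b.
    deepest_common = next(n for n in path_to_top(term_a) if n in others)
    return (depth_in_subgraph(deepest_common) + 1) / (MAX_DEPTH[group_a] + 1)
```

Because each score is normalized by the depth of its own sub-graph, identical terms score 1.0 in both the deep branch (`a11`) and the shallow branch (`b1`), whereas any pair that spans the two branches scores 0.0.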

CONCLUSIONS

The TCSS algorithm outperforms the other semantic similarity measures that we evaluated, both in distinguishing true from false protein interactions and in correlation with gene expression and protein families. We show an average improvement of 4.6 times the F1 score over Resnik, the next best method, on our Saccharomyces cerevisiae PPI dataset and 2 times on our Homo sapiens PPI dataset using cellular component, biological process and molecular function GO annotations.
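
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a similarity score thresholded into true/false interaction calls. The scores, labels, and threshold are hypothetical, and the abstract does not describe the paper's evaluation protocol, so this shows only the metric, not the authors' procedure.

```python
def f1_score(scores, labels, threshold):
    """F1 of the rule 'call a pair a true interaction if score >= threshold'."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical similarity scores for five protein pairs and their true labels.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
example_labels = [True, True, False, True, False]
example_f1 = f1_score(example_scores, example_labels, threshold=0.5)
```

With these made-up values, two of the three true interactions are recovered with no false positives, so precision is 1, recall is 2/3, and F1 is 0.8.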

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b27e/2998529/a8d668065513/1471-2105-11-562-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b27e/2998529/6665493c8299/1471-2105-11-562-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b27e/2998529/80d10c59dc9c/1471-2105-11-562-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b27e/2998529/e1f5d8641e96/1471-2105-11-562-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b27e/2998529/e8bbc13dfd10/1471-2105-11-562-5.jpg
