A shortest-path graph kernel for estimating gene product semantic similarity.

Authors

Alvarez Marco A, Qi Xiaojun, Yan Changhui

Affiliation

Department of Computer Science, North Dakota State University, Fargo, 58108, USA.

Publication details

J Biomed Semantics. 2011 Jul 29;2:3. doi: 10.1186/2041-1480-2-3.

DOI:10.1186/2041-1480-2-3
PMID:21801410
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3161911/
Abstract

BACKGROUND

Existing methods for calculating semantic similarity between gene products using the Gene Ontology (GO) often rely on external resources, which are not part of the ontology. Consequently, changes in these external resources, such as a biased term distribution caused by shifts in hot research topics, will affect the calculation of semantic similarity. One way to avoid this problem is to use semantic methods that are "intrinsic" to the ontology, i.e. independent of external knowledge.

RESULTS

We present a shortest-path graph kernel (spgk) method that relies exclusively on the GO and its structure. In spgk, a gene product is represented by an induced subgraph of the GO, which consists of all the GO terms annotating it. Then a shortest-path graph kernel is used to compute the similarity between two graphs. In a comprehensive evaluation using a benchmark dataset, spgk compares favorably with other methods that depend on external resources. Compared with simUI, a method that is also intrinsic to GO, spgk achieves slightly better results on the benchmark dataset. Statistical tests show that the improvement is significant when the resolution and EC similarity correlation coefficient are used to measure the performance, but is insignificant when the Pfam similarity correlation coefficient is used.

CONCLUSIONS

Spgk applies a polynomial-time graph kernel to exploit the structure of the GO when calculating semantic similarity between gene products. It offers an alternative, with comparable performance, to both methods that use external resources and other "intrinsic" methods.
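
The approach summarized in the Results can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not spell out: the toy ontology `TOY_GO` and its term names are hypothetical, the induced subgraph is taken to include each annotating term together with all of its ancestors, edges are followed from child to parent with unit weight, paths are compared with a simple delta kernel (same endpoints and same length), and the result is cosine-normalized so that a gene product compared with itself scores 1.

```python
from collections import deque
from math import sqrt

# Hypothetical toy ontology (not real GO terms): term -> direct parents.
TOY_GO = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C"],
}


def induced_subgraph(annotations, parents):
    """Annotating terms plus all their ancestors (assumption), with edges among them."""
    nodes, stack = set(), list(annotations)
    while stack:
        term = stack.pop()
        if term not in nodes:
            nodes.add(term)
            stack.extend(parents[term])
    return {t: [p for p in parents[t] if p in nodes] for t in nodes}


def shortest_paths(graph):
    """Map (source, target) -> shortest child-to-ancestor path length, via BFS."""
    lengths = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for parent in graph[node]:
                if parent not in dist:
                    dist[parent] = dist[node] + 1
                    queue.append(parent)
        for target, d in dist.items():
            if target != source:
                lengths[(source, target)] = d
    return lengths


def sp_kernel(g1, g2):
    """Count shortest paths with identical endpoints and length (delta kernel)."""
    sp1, sp2 = shortest_paths(g1), shortest_paths(g2)
    return sum(1 for pair, d in sp1.items() if sp2.get(pair) == d)


def similarity(ann1, ann2, parents=TOY_GO):
    """Cosine-normalized kernel value in [0, 1] (the normalization is an assumption)."""
    g1 = induced_subgraph(ann1, parents)
    g2 = induced_subgraph(ann2, parents)
    norm = sqrt(sp_kernel(g1, g1) * sp_kernel(g2, g2))
    return sp_kernel(g1, g2) / norm if norm else 0.0


if __name__ == "__main__":
    print(similarity({"E"}, {"E"}))  # identical annotations
    print(similarity({"C"}, {"E"}))  # E is a descendant of C
    print(similarity({"E"}, {"D"}))  # share only the A -> root path
```

Because every all-pairs shortest-path table is computed with one breadth-first search per node and the comparison is a dictionary lookup, the sketch runs in polynomial time in the size of the two subgraphs, in line with the Conclusions. It also uses no annotation corpus or other external resource, which is what makes a measure of this kind "intrinsic".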

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/785e/3161911/978f838a8a57/2041-1480-2-3-1.jpg
