The effects of shared information on semantic calculations in the gene ontology.

Authors

Bible Paul W, Sun Hong-Wei, Morasso Maria I, Loganantharaj Rasiah, Wei Lai

Affiliations

State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China.

Biodata Mining and Discovery Section, Office of Science and Technology, Intramural Research Program, National Institute of Arthritis and Musculoskeletal and Skin Diseases, Bethesda, Maryland.

Publication information

Comput Struct Biotechnol J. 2017 Jan 30;15:195-211. doi: 10.1016/j.csbj.2017.01.009. eCollection 2017.

DOI:10.1016/j.csbj.2017.01.009

PMID:28217262

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5299144/

Abstract

The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts, then substitutes this calculation into traditional term similarity measures such as Resnik, Lin, and Jiang-Conrath. Alternative SI approaches, when combined with ontology choice and term similarity type, lead to many gene-to-gene similarity measures. No thorough investigation has been made into the behavior, complexity, and performance of semantic methods derived from distinct SI approaches. We apply bootstrapping to compare the generalized performance of 57 gene-to-gene semantic measures across six benchmarks. Considering the number of measures, we additionally evaluate whether these methods can be leveraged through ensemble machine learning to improve prediction performance. Results showed that the choice of ontology type most strongly influenced performance across all evaluations. Combining measures into an ensemble classifier reduces cross-validation error beyond any individual measure for protein interaction prediction. This improvement resulted from information gained through the combination of ontology types, as ensemble methods within each GO type offered no improvement. These results demonstrate that multiple SI measures can be leveraged for machine learning tasks such as automated gene function prediction by incorporating methods from across the ontologies. To facilitate future research in this area, we developed the GO Graph Tool Kit (GGTK), an open-source C++ library with a Python interface (github.com/paulbible/ggtk).
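
As an illustration of how a shared information value can be substituted into the Resnik, Lin, and Jiang-Conrath measures named in the abstract, here is a minimal Python sketch. It is not the GGTK API and not the authors' implementation. The toy ontology `PARENTS`, the term probabilities `PROB`, and all function names are hypothetical, and SI is taken to be the information content of the most informative common ancestor, which is only one of the possible SI approaches.

```python
import math

# Hypothetical toy ontology (child -> parents); not real GO terms.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "kinase": {"catalysis"},
    "dna_kinase": {"dna_binding", "kinase"},
}

# Made-up annotation probabilities p(term); the root covers everything.
PROB = {
    "root": 1.0,
    "binding": 0.5,
    "catalysis": 0.5,
    "dna_binding": 0.25,
    "kinase": 0.25,
    "dna_kinase": 0.125,
}


def ancestors(term):
    """Return the term together with all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content: IC(t) = log2(1 / p(t))."""
    return math.log2(1.0 / PROB[term])


def shared_information(a, b):
    """One SI choice (assumed here): IC of the most informative common ancestor."""
    return max(ic(t) for t in ancestors(a) & ancestors(b))


def resnik(a, b, si=shared_information):
    """Resnik similarity is the shared information itself."""
    return si(a, b)


def lin(a, b, si=shared_information):
    """Lin similarity: 2 * SI / (IC(a) + IC(b))."""
    denom = ic(a) + ic(b)
    # When both terms have IC 0 the ratio is undefined; this sketch returns 0.0.
    return 2.0 * si(a, b) / denom if denom > 0 else 0.0


def jiang_conrath_distance(a, b, si=shared_information):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * SI."""
    return ic(a) + ic(b) - 2.0 * si(a, b)
```

Each measure takes the SI function as a parameter, so a different SI approach can be swapped in by passing another function for `si` while the three formulas stay unchanged.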

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7b5/5299144/9a072876ef9d/fx1.jpg
