Exploring Approaches for Detecting Protein Functional Similarity within an Orthology-based Framework.

Affiliation

Center for Biomedicine, European Academy of Bozen/Bolzano (EURAC), (Affiliated to the University of Lübeck, Lübeck, Germany), Viale Druso 1, 39100, Bolzano, Italy.

Publication information

Sci Rep. 2017 Mar 23;7(1):381. doi: 10.1038/s41598-017-00465-5.

DOI:10.1038/s41598-017-00465-5

PMID:28336965

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5428484/

Abstract

Protein functional similarity based on gene ontology (GO) annotations serves as a powerful tool when comparing proteins on a functional level in applications such as protein-protein interaction prediction, gene prioritization, and disease gene discovery. Functional similarity (FS) is usually quantified by combining the GO hierarchy with an annotation corpus that links genes and gene products to GO terms. One large group of algorithms involves calculation of GO term semantic similarity (SS) between all the terms annotating the two proteins, followed by a second step, described as "mixing strategy", which involves combining the SS values to yield the final FS value. Due to the variability of protein annotation caused e.g. by annotation bias, this value cannot be reliably compared on an absolute scale. We therefore introduce a similarity z-score that takes into account the FS background distribution of each protein. For a selection of popular SS measures and mixing strategies we demonstrate moderate accuracy improvement when using z-scores in a benchmark that aims to separate orthologous cases from random gene pairs and discuss in this context the impact of annotation corpus choice. The approach has been implemented in Frela, a fast high-throughput public web server for protein FS calculation and interpretation.
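
The two-step scheme described in the abstract (pairwise term SS, then a mixing strategy, then a z-score against a background distribution) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy SS function, and the placeholder term identifiers are hypothetical, the best-match average is only one common mixing strategy, and the z-score below is a plain standardization against a single protein's background FS values. The abstract does not specify how Frela computes or combines the per-protein backgrounds, so this should not be read as the authors' implementation.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

# Hypothetical type: any function returning a semantic similarity in [0, 1]
# for a pair of GO term identifiers.
TermSimilarity = Callable[[str, str], float]


def best_match_average(
    terms_a: Sequence[str], terms_b: Sequence[str], ss: TermSimilarity
) -> float:
    """Mixing strategy (assumed): best-match average over both directions.

    For every term of one protein, take its best SS against the other
    protein's terms, then average all best matches.
    """
    if not terms_a or not terms_b:
        return 0.0  # no annotations, no measurable similarity
    best_for_a = [max(ss(a, b) for b in terms_b) for a in terms_a]
    best_for_b = [max(ss(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_for_a) + sum(best_for_b)) / (len(best_for_a) + len(best_for_b))


def similarity_z_score(fs_value: float, background: Sequence[float]) -> float:
    """Standardize an FS value against a protein's background FS values.

    `background` is assumed to hold FS values of the protein of interest
    against a reference set of other proteins.
    """
    sigma = pstdev(background)
    if sigma == 0:
        return 0.0  # degenerate background: no spread to compare against
    return (fs_value - mean(background)) / sigma


def toy_ss(term_a: str, term_b: str) -> float:
    """Toy SS (assumption): identical terms score 1.0, all others 0.5."""
    return 1.0 if term_a == term_b else 0.5


if __name__ == "__main__":
    # Placeholder identifiers, not real GO accessions.
    protein_a = ["GO:A", "GO:B"]
    protein_b = ["GO:A"]
    fs = best_match_average(protein_a, protein_b, toy_ss)
    print(fs)
    print(similarity_z_score(fs, [0.0, 1.0]))
```

Standardizing against each protein's own background is what lets a raw FS value from a heavily annotated protein be compared with one from a sparsely annotated protein, which is the motivation the abstract gives for the z-score.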

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e270/5428484/54c45d5b4244/41598_2017_465_Fig1_HTML.jpg
