Novel symmetry-based gene-gene dissimilarity measures utilizing Gene Ontology: Application in gene clustering.

Author information

Department of Computer Science and Engineering, IIT Patna, India.

Publication information

Gene. 2018 Dec 30;679:341-351. doi: 10.1016/j.gene.2018.08.062. Epub 2018 Sep 2.

Abstract

DNA microarray technology, which generates high-volume biological data, has attracted significant attention in recent years. Clustering is a powerful tool for analyzing such high-volume gene-expression data, and the effectiveness of any clustering algorithm depends largely on its underlying similarity/dissimilarity measure. When analyzing such data, it is often necessary to assess the similarity of genes not only with respect to their expression values but also with respect to their functional annotations, which can be obtained from Gene Ontology (GO) databases. Several clustering and bi-clustering approaches have been proposed in the literature to identify co-regulated genes from gene-expression datasets. However, identifying co-regulated genes from expression data alone misses important biological information about gene function, which is needed to identify semantically related genes. In this paper, we propose sixteen semantic gene-gene dissimilarity measures that use biological information about genes retrieved from a global biological database, the Gene Ontology (GO). Four proximity measures (Euclidean, cosine, point symmetry, and line symmetry) are combined with four representations of gene-GO-term annotation vectors to yield the sixteen measures. To illustrate the utility of the proposed measures, several multi-objective and single-objective clustering algorithms are applied with them to identify functionally similar genes in Mouse genome and Yeast datasets. Furthermore, we compare the performance of the sixteen proposed measures with three existing state-of-the-art semantic similarity and distance measures.
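The abstract does not give formulas, so the following is only a rough sketch of how proximity measures of this kind might be computed over gene-GO-term annotation vectors. It assumes a binary presence/absence representation (which may or may not be one of the paper's four), uses made-up toy annotations, and borrows a commonly used point-symmetry formulation (reflect a point through a center, then average the distances to the nearest neighbours of the reflection). All function and gene names are hypothetical, and line symmetry is not shown.

```python
import math

# Hypothetical toy annotations: gene -> set of GO term IDs (illustrative only).
annotations = {
    "geneA": {"GO:0006412", "GO:0005840"},
    "geneB": {"GO:0006412", "GO:0005840", "GO:0003735"},
    "geneC": {"GO:0006281"},
}

# Fixed term order so every gene maps to a vector of the same length.
terms = sorted(set().union(*annotations.values()))


def binary_vector(gene):
    """Assumed representation: 1.0 if the gene is annotated with the term, else 0.0."""
    return [1.0 if t in annotations[gene] else 0.0 for t in terms]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dissimilarity(u, v):
    """One minus cosine similarity; zero vectors are treated as maximally dissimilar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def point_symmetry_distance(x, center, points, knear=2):
    """Assumed formulation: reflect x through center, average the distances from
    the reflected point to its knear nearest neighbours in points, then scale
    by the Euclidean distance between x and center."""
    reflected = [2 * c - a for a, c in zip(x, center)]
    nearest = sorted(euclidean(reflected, p) for p in points)[:knear]
    return (sum(nearest) / len(nearest)) * euclidean(x, center)


vec_a, vec_b, vec_c = (binary_vector(g) for g in ("geneA", "geneB", "geneC"))
print(euclidean(vec_a, vec_b), cosine_dissimilarity(vec_a, vec_c))
```

Under this toy encoding, geneA and geneB share two of three terms and come out close, while geneA and geneC share none and reach the maximum cosine dissimilarity. Unlike the two pairwise measures, the point-symmetry distance is defined relative to a reference point such as a cluster center, so it would typically be evaluated inside a clustering loop.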

Similar articles

Novel symmetry-based gene-gene dissimilarity measures utilizing Gene Ontology: Application in gene clustering.

Gene. 2018 Dec 30;679:341-351. doi: 10.1016/j.gene.2018.08.062. Epub 2018 Sep 2.

Multi-Factored Gene-Gene Proximity Measures Exploiting Biological Knowledge Extracted from Gene Ontology: Application in Gene Clustering.

IEEE/ACM Trans Comput Biol Bioinform. 2020 Jan-Feb;17(1):207-219. doi: 10.1109/TCBB.2018.2849362. Epub 2018 Jun 21.

GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness.

BMC Bioinformatics. 2019 Mar 27;20(1):155. doi: 10.1186/s12859-019-2752-2.

Influence of the go-based semantic similarity measures in multi-objective gene clustering algorithm performance.

J Bioinform Comput Biol. 2020 Dec;18(6):2050038. doi: 10.1142/S0219720020500389. Epub 2020 Nov 5.

Incorporating gene ontology into fuzzy relational clustering of microarray gene expression data.

Biosystems. 2018 Jan;163:1-10. doi: 10.1016/j.biosystems.2017.09.017. Epub 2017 Nov 4.

Measuring semantic similarities by combining gene ontology annotations and gene co-function networks.

BMC Bioinformatics. 2015 Feb 14;16:44. doi: 10.1186/s12859-015-0474-7.

Assessment of Semantic Similarity between Proteins Using Information Content and Topological Properties of the Gene Ontology Graph.

IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):839-849. doi: 10.1109/TCBB.2017.2689762. Epub 2017 Mar 31.

Dynamically weighted clustering with noise set.

Bioinformatics. 2010 Feb 1;26(3):341-7. doi: 10.1093/bioinformatics/btp671. Epub 2009 Dec 9.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Mol Biol. 2001 Dec 14;314(5):1053-66. doi: 10.1006/jmbi.2000.5219.

Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes.

BMC Bioinformatics. 2006 Aug 31;7:397. doi: 10.1186/1471-2105-7-397.

Cited by

Single-Cell Analysis of Endothelial Cell Injury in IgA Nephropathy.

Immun Inflamm Dis. 2025 Feb;13(2):e70149. doi: 10.1002/iid3.70149.
