Nwadiugwu Martin C
Department of Biomedical Informatics, University of Nebraska Omaha, Omaha, NE, USA.
Bioinform Biol Insights. 2020 Apr 1;14:1177932220909851. doi: 10.1177/1177932220909851. eCollection 2020.
The current study compares 3 clustering algorithms that can be used in gene-based bioinformatics research to understand disease networks, protein-protein interaction networks, and gene expression data. Denclue, Fuzzy-C, and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) were the 3 gene-based clustering algorithms selected. These algorithms were explored in relation to the subfield of bioinformatics that analyzes omics data, which include but are not limited to genomics, proteomics, metagenomics, transcriptomics, and metabolomics data. The objective was to compare the efficacy of the 3 algorithms and determine their strengths and drawbacks. The results of the review showed that, unlike Denclue and Fuzzy-C, which are more efficient in handling noisy data, BIRCH can handle data sets with outliers and has a better time complexity.