Ma Yuanyuan, Zhao Junmin, Ma Yingjun
School of Computer & Information Engineering, Anyang Normal University, Anyang, China.
School of Computer & Data Science, Henan University of Urban Construction, Pingdingshan, China.
BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):234. doi: 10.1186/s12859-020-03555-w.
With the rapid development of high-throughput techniques, large volumes of heterogeneous omics data (e.g., genomics, proteomics and metabolomics data) have accumulated. Integrating information from multiple sources or views to gain deeper insight into the complicated relations among micro-organisms, nutrients and host environment remains challenging. In this paper we propose a multi-view Hessian regularization based symmetric nonnegative matrix factorization algorithm (MHSNMF) for clustering heterogeneous microbiome data. Compared with many existing approaches, MHSNMF has the following advantages: (1) it combines multiple Hessian regularization terms to leverage the high-order information from the same cohort of instances with multiple representations; (2) it exploits the advantages of symmetric nonnegative matrix factorization (SNMF) and naturally handles the complex relationships among microbiome samples; (3) using the consensus matrix obtained by MHSNMF, we also design a novel approach to predict the classification of new microbiome samples.
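To make the SNMF building block concrete, the following is a minimal sketch of plain single-view symmetric NMF, which approximates a nonnegative similarity matrix A by H Hᵀ with H ≥ 0. It deliberately omits the Hessian regularization and the multi-view consensus step that define MHSNMF, so it is not the authors' algorithm. The function names `snmf` and `reconstruction_error`, the damped multiplicative update with `beta = 0.5`, and the toy block-diagonal matrix are all assumptions chosen for illustration.

```python
import numpy as np


def snmf(A, k, n_iter=200, beta=0.5, seed=0, eps=1e-10):
    """Plain symmetric NMF: find H >= 0 such that A is approximately H @ H.T.

    Assumption: uses the damped multiplicative update
        H <- H * ((1 - beta) + beta * (A H) / (H H^T H)),
    not the regularized multi-view update of MHSNMF.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))  # nonnegative random initialization
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps guards against division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


def reconstruction_error(A, H):
    """Frobenius norm of the residual A - H H^T."""
    return np.linalg.norm(A - H @ H.T)


# Toy similarity matrix (hypothetical data): two groups of three samples each.
A = np.kron(np.eye(2), np.ones((3, 3)))
H0 = snmf(A, k=2, n_iter=0)    # initialization only
H = snmf(A, k=2, n_iter=200)   # after the updates
labels = H.argmax(axis=1)      # cluster = index of the largest factor entry
```

Because every factor in the update is nonnegative, H stays nonnegative throughout, and each row of H can be read as soft cluster memberships. Taking the row-wise argmax is one common way to turn the factor into hard cluster labels.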
We conduct extensive experiments on two real-world datasets (Three-source dataset and Human Microbiome Plan dataset). The experimental results show that the proposed MHSNMF algorithm outperforms other baseline and state-of-the-art methods. On the microbiome data, MHSNMF achieves the best performance among the compared methods (accuracy: 95.28%, normalized mutual information: 91.79%). This suggests the potential of MHSNMF for microbiome data analysis.
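The two reported metrics can be sketched as follows. This is an illustrative implementation using only the Python standard library, not the authors' evaluation code. It assumes clustering accuracy is computed under the best one-to-one mapping between predicted and true labels (brute-forced over permutations, so only suitable for a small number of clusters) and that NMI is normalized by the geometric mean of the two entropies; the abstract does not state which normalization was used. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def clustering_accuracy(true, pred):
    """Accuracy under the best one-to-one mapping of predicted to true labels."""
    t_labels = list(set(true))
    p_labels = list(set(pred))
    if len(p_labels) > len(t_labels):
        raise ValueError("sketch assumes no more predicted clusters than classes")
    best = 0
    # Brute force over mappings: fine for a handful of clusters only.
    for perm in permutations(t_labels, len(p_labels)):
        mapping = dict(zip(p_labels, perm))
        hits = sum(1 for t, p in zip(true, pred) if mapping[p] == t)
        best = max(best, hits)
    return best / len(true)


def entropy(counts, n):
    """Shannon entropy (natural log) of a label distribution."""
    return -sum(c / n * log(c / n) for c in counts.values())


def nmi(true, pred):
    """Normalized mutual information, geometric-mean normalization (assumed)."""
    n = len(true)
    ct, cp = Counter(true), Counter(pred)
    joint = Counter(zip(true, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    ht, hp = entropy(ct, n), entropy(cp, n)
    if ht == 0 or hp == 0:
        # Degenerate case: a single cluster on one or both sides.
        return 1.0 if ht == hp else 0.0
    return mi / sqrt(ht * hp)


# Toy labels (hypothetical): the second prediction misplaces one sample.
y_true = [0, 0, 0, 1, 1, 1]
perfect = [1, 1, 1, 0, 0, 0]   # same partition, labels swapped
noisy = [0, 0, 1, 1, 1, 1]
acc_noisy = clustering_accuracy(y_true, noisy)
nmi_noisy = nmi(y_true, noisy)
```

Both metrics are invariant to how the clusters are named, which is why the label-swapped prediction scores the same as a prediction with identical labels.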
The results show that the proposed MHSNMF algorithm can effectively combine the phylogenetic, transporter, and metabolic profiles into a unified paradigm for analyzing the relationships among different microbiome samples. Furthermore, the proposed prediction method based on MHSNMF is shown to be effective in identifying the types of new microbiome samples.