School of Mathematics and Statistics, Lanzhou University, Lanzhou, China.
Department of Data Science and Information Technology, Taiz University, Taiz, Yemen.
BMC Bioinformatics. 2022 Jul 21;23(1):288. doi: 10.1186/s12859-022-04826-4.
Methods for multiview clustering and integration of multi-omics data have recently been developed to address problems caused by data noise or limited sample size and to integrate data with consistent (common) and differential cluster patterns. However, the integration of such data still suffers from limited performance and low accuracy.
In this study, a computational framework for multiview clustering based on a penalty model is presented to overcome the low accuracy and limited performance seen when integrating multi-omics data with consistent (common) and differential cluster patterns. The proposed method was evaluated on synthetic data and four real multi-omics datasets and compared with approaches from the literature under different scenarios. On synthetic data, the results indicate that the method performs competitively with recently developed techniques when the underlying clusters are consistent, and that it performs better when the clusters are differential. On real omics data, the method also performs better: it provides more detailed information within each data type and integrates multi-omics data with consistent (common) and differential cluster patterns more effectively. The proposed method also yields more significant differences in survival times across all cancer types studied.
This study proposes a new multiview clustering method and evaluates it on synthetic and real data. The method outperforms techniques previously presented in the literature both in integrating multi-omics data with consistent and differential cluster patterns and in the significance of the differences in survival times between the resulting clusters.