Huang Yixiang, Jiang Hao, Ching Wai-Ki
School of Mathematics, Renmin University of China, No. 59 Zhongguancun Street, 100872, Beijing, China.
Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae203.
With the emergence of large amounts of single-cell RNA sequencing (scRNA-seq) data, computational methods have become critical for revealing biological mechanisms. Clustering is a representative approach for deciphering the cellular heterogeneity embedded in scRNA-seq data. However, because datasets are so diverse, no existing single-cell clustering method performs best on all of them. Weighted ensemble methods have been proposed to integrate multiple clustering results and improve heterogeneity analysis. These methods usually assign one weight per base clustering according to its overall reliability, ignoring the fact that the same base clustering can perform differently on different cells. In this paper, we propose scEWE, a self-representative ensemble learning framework based on a high-order element-wise weighting strategy. By assigning different base clustering weights to individual cells, we construct and optimize the consensus matrix at a finer granularity. In addition, we extract high-order information between cells, which enhances the ability to represent cell-to-cell similarity. In experiments, scEWE significantly outperforms state-of-the-art methods, demonstrating its effectiveness and supporting its potential application to complex single-cell data analysis problems.
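The idea of weighting each base clustering per cell, rather than once globally, can be illustrated with a small sketch. This is not the scEWE formulation, which the abstract does not specify: the function names, the product-of-weights rule for cell pairs, and the cosine-based second-order similarity are all assumptions chosen only to make the concept concrete.

```python
import numpy as np


def elementwise_weighted_consensus(labels, cell_weights):
    """Illustrative (hypothetical) cell-wise weighted consensus matrix.

    labels:       (m, n) integer array, m base clusterings over n cells.
    cell_weights: (m, n) non-negative array; entry [k, i] is the assumed
                  reliability of base clustering k on cell i.
    Returns an (n, n) matrix with entries in [0, 1].
    """
    labels = np.asarray(labels)
    w = np.asarray(cell_weights, dtype=float)
    m, n = labels.shape
    num = np.zeros((n, n))
    den = np.zeros((n, n))
    for k in range(m):
        # 1 where clustering k puts cells i and j in the same cluster.
        co = (labels[k][:, None] == labels[k][None, :]).astype(float)
        # Assumption: the weight of a cell pair is the product of the
        # two per-cell weights for this clustering.
        pair_w = np.outer(w[k], w[k])
        num += pair_w * co
        den += pair_w
    # Pairs with zero total weight get similarity 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def second_order_similarity(consensus):
    """Cosine similarity between rows of the consensus matrix.

    One simple stand-in for "high-order" information: two cells are
    similar if they relate to the other cells in similar ways.
    """
    c = np.asarray(consensus, dtype=float)
    norms = np.linalg.norm(c, axis=1, keepdims=True)
    unit = np.divide(c, norms, out=np.zeros_like(c), where=norms > 0)
    return unit @ unit.T


# Two base clusterings of four cells.
labels = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1]])
# A single global weight per clustering (all cells weighted equally).
uniform = np.ones((2, 4))
# Per-cell weights: the second clustering is distrusted on cells 0 and 1.
per_cell = np.array([[1.0, 1.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0, 1.0]])

c_uniform = elementwise_weighted_consensus(labels, uniform)
c_cell = elementwise_weighted_consensus(labels, per_cell)
s2 = second_order_similarity(c_cell)
```

With uniform weights, cells 0 and 1 receive a consensus value of 0.5 because the two clusterings disagree about them. With the per-cell weights, the second clustering no longer contributes to that pair, so the value rises to 1.0. A single global weight per clustering cannot express this kind of cell-specific distinction.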