Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae068.
Cluster assignment is vital for analyzing single-cell RNA sequencing (scRNA-seq) data and understanding high-level biological processes. Deep learning-based clustering methods have recently become widely used in scRNA-seq data analysis. However, existing deep models often overlook the interconnections and interactions among network layers, which leads to a loss of structural information within those layers. Herein, we develop a new self-supervised clustering method based on an adaptive multi-scale autoencoder, called scAMAC. The self-supervised clustering network uses a Multi-Scale Attention mechanism to fuse feature information from the encoder, hidden and decoder layers of the multi-scale autoencoder, which enables it to explore cellular correlations within the same scale and to capture deep features across different scales. The clustering network calculates a membership matrix from the fused latent features and is then optimized based on that membership matrix. scAMAC employs an adaptive feedback mechanism to supervise the parameter updates of the multi-scale autoencoder, yielding a more effective representation of cell features. scAMAC not only clusters cells but also reconstructs the data through the decoding layer. Through extensive experiments, we demonstrate that scAMAC is superior to several advanced clustering and imputation methods in both data clustering and reconstruction. In addition, scAMAC benefits downstream analyses such as cell trajectory inference. The scAMAC code is freely available at https://github.com/yancy2024/scAMAC.
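The abstract does not give the formulas behind the attention fusion or the membership matrix, so the following is only a minimal sketch of the general idea, not the scAMAC implementation. It assumes that the per-layer features have already been projected to a common dimension, that the attention weights are a softmax over one hypothetical score per layer, and that memberships follow the standard fuzzy c-means form. The function names, the fuzzifier `m`, and the toy values are illustrative assumptions; plain Python is used to keep the example dependency-free.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_features(layer_feats, layer_scores):
    """Attention-weighted sum of per-layer feature vectors for one cell.

    layer_feats:  equal-length vectors, e.g. from encoder, hidden and
                  decoder layers (assumed projected to one dimension).
    layer_scores: one hypothetical attention score per layer.
    """
    weights = softmax(layer_scores)
    dim = len(layer_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, layer_feats))
            for d in range(dim)]

def membership_matrix(cells, centers, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships (assumed form, not from the paper).

    Returns one row per cell and one column per cluster; each row sums to 1.
    """
    power = 1.0 / (m - 1.0)
    rows = []
    for x in cells:
        # Squared Euclidean distance from this cell to every cluster center.
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + eps for c in centers]
        inv = [(1.0 / d) ** power for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows

# Toy usage: two cells, three layers each, two cluster centers.
cell_a = fuse_features([[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]], [0.5, 1.0, 0.5])
cell_b = fuse_features([[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]], [0.5, 1.0, 0.5])
memberships = membership_matrix([cell_a, cell_b], [[0.0, 0.0], [1.0, 1.0]])
hard_labels = [row.index(max(row)) for row in memberships]
```

In a trained model the layer scores and cluster centers would be learned rather than fixed by hand, and the soft memberships would feed the clustering loss. Here they only show how fused features map to soft, then hard, cluster assignments.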