Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China.
Int J Mol Sci. 2024 Feb 13;25(4):2234. doi: 10.3390/ijms25042234.
Single-cell RNA sequencing (scRNA-seq) data reveal the complexity and diversity of cellular ecosystems and molecular interactions across biomedical research. Identifying cell types in large-scale scRNA-seq data from existing annotations remains challenging and requires stable, interpretable methods. Current cell type identification methods have limited performance, mainly because of intrinsic heterogeneity among cell populations and extrinsic differences between datasets. Here, we present scMGCN, a robust multi-view graph convolutional network model that integrates multiple graph structures from raw scRNA-seq data and applies graph convolutional networks with attention mechanisms to learn cell embeddings and predict cell labels. We evaluate the model in single-dataset, cross-species, and cross-platform experiments and compare it with other state-of-the-art methods. The results show that scMGCN outperforms the other methods in stability, accuracy, and robustness to batch effects. Our main contributions are as follows. First, we introduce multi-view learning and multiple graph construction methods to capture comprehensive cellular information from scRNA-seq data. Second, we construct scMGCN, which combines graph convolutional networks with attention mechanisms to extract shared, high-order information from cells. Finally, we demonstrate the effectiveness and superiority of scMGCN on various datasets.
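To make the architecture described above more concrete, the following is a minimal NumPy sketch of a multi-view graph convolutional forward pass with attention-based fusion of views. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which graph construction methods scMGCN uses, so the two k-nearest-neighbour views (cosine and Euclidean), the single GCN layer per view, the attention scoring form, and every name (`knn_graph`, `normalize_adj`, `multi_view_forward`, `params`) are hypothetical. The weights are random and untrained, and the expression matrix is synthetic.

```python
import numpy as np

def knn_graph(X, k, metric):
    """Symmetric k-nearest-neighbour adjacency (no self-loops) for one view."""
    if metric == "cosine":
        Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
        dist = 1.0 - Xn @ Xn.T
    else:  # "euclidean"
        sq = (X ** 2).sum(axis=1)
        dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    np.fill_diagonal(dist, np.inf)              # a cell is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest cells
    A = np.zeros_like(dist)
    A[np.repeat(np.arange(len(X)), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                   # symmetrise

def normalize_adj(A):
    """Standard GCN normalisation: D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)     # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_view_forward(X, adjs, params):
    """One GCN layer per view, attention over views, softmax classifier."""
    views = [np.maximum(A_hat @ X @ W, 0.0)     # ReLU(A_hat X W_v)
             for A_hat, W in zip(adjs, params["W_view"])]
    H = np.stack(views)                         # (views, cells, hidden)
    # Assumed attention form: one scalar score per view, averaged over cells.
    scores = (np.tanh(H @ params["W_att"]) @ params["q"]).mean(axis=1)
    beta = softmax(scores)                      # view weights, sum to 1
    fused = np.tensordot(beta, H, axes=1)       # (cells, hidden)
    probs = softmax(fused @ params["W_out"], axis=1)
    return probs, beta

# Synthetic stand-in for a log-transformed count matrix (cells x genes).
rng = np.random.default_rng(0)
n_cells, n_genes, hidden, att_dim, n_types = 30, 50, 16, 8, 3
X = np.log1p(rng.poisson(2.0, size=(n_cells, n_genes)).astype(float))

adjs = [normalize_adj(knn_graph(X, k=5, metric=m))
        for m in ("cosine", "euclidean")]
params = {
    "W_view": [rng.normal(scale=0.1, size=(n_genes, hidden)) for _ in adjs],
    "W_att": rng.normal(scale=0.1, size=(hidden, att_dim)),
    "q": rng.normal(scale=0.1, size=att_dim),
    "W_out": rng.normal(scale=0.1, size=(hidden, n_types)),
}
probs, beta = multi_view_forward(X, adjs, params)
print(probs.shape, beta)
```

In this sketch each view contributes its own embedding of the same cells, and the attention weights `beta` decide how much each graph influences the fused embedding. A trained model would learn all four weight sets from labelled reference cells.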