School of Mathematics and Information Engineering, Longyan University, China.
Department of Automation, Xiamen University, China.
J Biomed Inform. 2021 Oct;122:103899. doi: 10.1016/j.jbi.2021.103899. Epub 2021 Sep 3.
Single-cell RNA sequencing (scRNA-seq) is fast becoming a powerful technology that is revolutionizing biomedical studies of development, immunology and cancer by providing genome-scale transcriptional profiles at unprecedented throughput and resolution. However, because of the low capture rate and frequent drop-out events in the sequencing process, scRNA-seq data suffer from extremely high sparsity and variability, which complicates data analysis. Here we propose a novel method called scLINE for learning low-dimensional representations of scRNA-seq data. scLINE is based on a network embedding model that jointly considers multiple gene-gene interaction networks, which makes it possible to incorporate prior biological knowledge for signal extraction. We comprehensively evaluated scLINE on eight single-cell datasets. The results show that scLINE achieved comparable or higher performance than competing methods, including PCA, t-SNE and Isomap, in terms of internal validation metrics and clustering accuracy. The low-dimensional representations learned by scLINE are effective for downstream single-cell analysis, such as visualization, clustering and cell typing. We have implemented scLINE as an easy-to-use R package, which can be incorporated into existing scRNA-seq analysis pipelines or tools for data preprocessing.
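To make the phrase "internal validation metrics" concrete, the sketch below computes the mean silhouette width of a clustering over a low-dimensional embedding. This is an illustration only: the abstract does not state which internal metrics were used, so the silhouette is an assumed example of one, and the function names, toy coordinates and labels are all hypothetical rather than taken from scLINE or its R package.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette width over all points (needs at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical 2-D embedding of six cells forming two well-separated groups.
embedding = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible group structure
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good = silhouette_score(embedding, good_labels)
bad = silhouette_score(embedding, bad_labels)
print(f"silhouette (matching labels): {good:.3f}")
print(f"silhouette (mixed labels):    {bad:.3f}")
```

A higher score means cells sit closer to their own cluster than to the nearest other cluster, so an embedding that separates cell types well should score higher under the same labels. Clustering accuracy, by contrast, is an external measure that compares predicted clusters against known cell-type annotations.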