Discovering cell types using manifold learning and enhanced visualization of single-cell RNA-Seq data.

Affiliation

School of Computer Science, University of Windsor, Windsor, ON, Canada.

Publication information

Sci Rep. 2022 Jan 7;12(1):120. doi: 10.1038/s41598-021-03613-0.

Abstract

Identifying relevant disease modules, such as target cell types, is a significant step in studying diseases. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Computational techniques such as clustering are the most suitable approach to scRNA-seq data analysis when the cell types have not been well characterized. These techniques can be used to identify a group of genes that belong to a specific cell type based on their similar gene expression patterns. However, because scRNA-seq data are sparse and high-dimensional, classical clustering methods are not efficient. Non-linear dimensionality reduction techniques are therefore crucial for improving clustering results. We introduce a method that identifies representative clusters of different cell types by combining non-linear dimensionality reduction techniques with clustering algorithms. We assess the impact of different dimensionality reduction techniques, combined with clustering, on thirteen publicly available scRNA-seq datasets of different tissues, sizes, and technologies. We further perform gene set enrichment analysis to evaluate the proposed method's performance. Our results show that modified locally linear embedding combined with independent component analysis yields the best overall performance relative to existing unsupervised methods across the different datasets.
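The combination named in the abstract (modified locally linear embedding followed by independent component analysis, then clustering) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `embed_and_cluster`, the log transform, the choice of k-means as the clustering algorithm, the parameter values, and the synthetic Poisson count matrix are all hypothetical and do not come from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA
from sklearn.manifold import LocallyLinearEmbedding


def embed_and_cluster(expr, n_clusters, n_neighbors=15, n_components=3, seed=0):
    """Cluster cells (rows of expr) after modified LLE followed by ICA.

    All defaults are illustrative assumptions, not values from the paper.
    """
    # Log-transform counts to damp the influence of highly expressed genes.
    logged = np.log1p(expr)

    # Modified LLE: non-linear reduction; requires n_neighbors >= n_components.
    mlle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_components,
        method="modified",
        eigen_solver="dense",
        random_state=seed,
    )
    low_dim = mlle.fit_transform(logged)

    # ICA on the embedding to obtain statistically independent axes.
    ica = FastICA(n_components=n_components, max_iter=1000, random_state=seed)
    components = ica.fit_transform(low_dim)

    # k-means is an assumed stand-in for "clustering algorithms" in the abstract.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return components, km.fit_predict(components)


# Synthetic stand-in for a cells-by-genes count matrix: 3 groups of 50 cells,
# 50 genes, each group with its own Poisson rate profile (made-up data).
rng = np.random.default_rng(0)
rates = rng.gamma(2.0, 2.0, size=(3, 50))
expr = np.vstack([rng.poisson(rates[k], size=(50, 50)) for k in range(3)])

components, labels = embed_and_cluster(expr, n_clusters=3)
print(components.shape, np.bincount(labels))
```

The number of clusters is passed in explicitly here; on real data with uncharacterized cell types it would have to be estimated, which this sketch does not attempt.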

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5cc5/8742092/a5d271bab9e6/41598_2021_3613_Fig1_HTML.jpg
