Discovering cell types using manifold learning and enhanced visualization of single-cell RNA-Seq data.

Affiliation

School of Computer Science, University of Windsor, Windsor, ON, Canada.

Publication information

Sci Rep. 2022 Jan 7;12(1):120. doi: 10.1038/s41598-021-03613-0.

DOI: 10.1038/s41598-021-03613-0

PMID: 34996927

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8742092/

Abstract

Identifying relevant disease modules, such as target cell types, is a significant step in studying diseases. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Computational techniques such as clustering are the most suitable approach to scRNA-seq data analysis when the cell types have not been well characterized. These techniques can be used to identify a group of genes that belong to a specific cell type based on their similar gene expression patterns. However, because scRNA-seq data are sparse and high-dimensional, classical clustering methods are not efficient. The use of non-linear dimensionality reduction techniques to improve clustering results is therefore crucial. We introduce a method that identifies representative clusters of different cell types by combining non-linear dimensionality reduction techniques with clustering algorithms. We assess the impact of different dimensionality reduction techniques, each combined with clustering, on thirteen publicly available scRNA-seq datasets spanning different tissues, sizes, and technologies. We further perform gene set enrichment analysis to evaluate the proposed method's performance. Our results show that modified locally linear embedding combined with independent component analysis yields the best overall performance relative to existing unsupervised methods across the datasets.
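
The combination named in the abstract (modified locally linear embedding, then independent component analysis, then clustering) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the order of the steps, the clustering algorithm, the preprocessing, or any parameter values, so the log transform, the use of k-means, the helper name embed_and_cluster, and every parameter below are assumptions. The count matrix is simulated rather than taken from any of the thirteen datasets.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.metrics import adjusted_rand_score


def embed_and_cluster(counts, n_clusters, n_components=3, n_neighbors=15, seed=0):
    """Hypothetical helper: modified LLE -> ICA -> k-means on a cells x genes matrix."""
    # Assumed preprocessing: log-transform raw counts to compress their range.
    logged = np.log1p(counts)

    # Non-linear step: modified locally linear embedding (MLLE).
    # The dense eigensolver is chosen because the toy matrix is small.
    mlle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_components,
        method="modified",
        eigen_solver="dense",
    )
    embedded = mlle.fit_transform(logged)

    # Linear step: re-express the embedding as statistically independent components.
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    components = ica.fit_transform(embedded)

    # Clustering step: k-means is an assumed choice, not taken from the abstract.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(components)
    return components, labels


# Simulated stand-in for an scRNA-seq count matrix:
# 3 cell types x 50 cells each, 100 genes, Poisson counts around type-specific rates.
rng = np.random.default_rng(0)
n_types, n_per_type, n_genes = 3, 50, 100
rates = rng.gamma(shape=2.0, scale=2.0, size=(n_types, n_genes))
true_labels = np.repeat(np.arange(n_types), n_per_type)
counts = rng.poisson(rates[true_labels])

components, labels = embed_and_cluster(counts, n_clusters=n_types)
print("embedding shape:", components.shape)
print("adjusted Rand index vs. simulated types:",
      adjusted_rand_score(true_labels, labels))
```

On real data the number of neighbors, the number of components, and the number of clusters would all need tuning, and the adjusted Rand index is only available when reference cell-type labels exist.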

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5cc5/8742092/a5d271bab9e6/41598_2021_3613_Fig1_HTML.jpg

Similar articles

Discovering cell types using manifold learning and enhanced visualization of single-cell RNA-Seq data.

Sci Rep. 2022 Jan 7;12(1):120. doi: 10.1038/s41598-021-03613-0.

Single-cell analysis via manifold fitting: A framework for RNA clustering and beyond.

Proc Natl Acad Sci U S A. 2024 Sep 10;121(37):e2400002121. doi: 10.1073/pnas.2400002121. Epub 2024 Sep 3.

Contrastive self-supervised clustering of scRNA-seq data.

BMC Bioinformatics. 2021 May 27;22(1):280. doi: 10.1186/s12859-021-04210-8.

A multitask clustering approach for single-cell RNA-seq analysis in Recessive Dystrophic Epidermolysis Bullosa.

PLoS Comput Biol. 2018 Apr 9;14(4):e1006053. doi: 10.1371/journal.pcbi.1006053. eCollection 2018 Apr.

Supervised application of internal validation measures to benchmark dimensionality reduction methods in scRNA-seq data.

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab304.

A Gene Rank Based Approach for Single Cell Similarity Assessment and Clustering.

IEEE/ACM Trans Comput Biol Bioinform. 2021 Mar-Apr;18(2):431-442. doi: 10.1109/TCBB.2019.2931582. Epub 2021 Apr 6.

DCRELM: dual correlation reduction network-based extreme learning machine for single-cell RNA-seq data clustering.

Sci Rep. 2024 Jun 12;14(1):13541. doi: 10.1038/s41598-024-64217-y.

ScCAEs: deep clustering of single-cell RNA-seq via convolutional autoencoder embedding and soft K-means.

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab321.

Independent component analysis based gene co-expression network inference (ICAnet) to decipher functional modules for better single-cell clustering and batch integration.

Nucleic Acids Res. 2021 May 21;49(9):e54. doi: 10.1093/nar/gkab089.

GE-Impute: graph embedding-based imputation for single-cell RNA-seq data.

Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac313.

Cited by

Optimization of clustering parameters for single-cell RNA analysis using intrinsic goodness metrics.

Front Bioinform. 2025 Jun 11;5:1562410. doi: 10.3389/fbinf.2025.1562410. eCollection 2025.

scRL: Utilizing Reinforcement Learning to Evaluate Fate Decisions in Single-Cell Data.

Biology (Basel). 2025 Jun 11;14(6):679. doi: 10.3390/biology14060679.

Exploring RNA-Seq Data Analysis Through Visualization Techniques and Tools: A Systematic Review of Opportunities and Limitations for Clinical Applications.

Bioengineering (Basel). 2025 Jan 12;12(1):56. doi: 10.3390/bioengineering12010056.

scGAA: a general gated axial-attention model for accurate cell-type annotation of single-cell RNA-seq data.

Sci Rep. 2024 Sep 27;14(1):22308. doi: 10.1038/s41598-024-73356-1.

Vis-SPLIT: Interactive Hierarchical Modeling for mRNA Expression Classification.

IEEE Vis Conf. 2023 Oct;2023:106-110. doi: 10.1109/vis54172.2023.00030. Epub 2023 Dec 20.

nPCA: a linear dimensionality reduction method using a multilayer perceptron.

Front Genet. 2024 Jan 8;14:1290447. doi: 10.3389/fgene.2023.1290447. eCollection 2023.

The two-stage molecular scenery of SARS-CoV-2 infection with implications to disease severity: An in-silico quest.

Front Immunol. 2023 Nov 21;14:1251067. doi: 10.3389/fimmu.2023.1251067. eCollection 2023.

Single-Cell Transcriptomics for Unlocking Personalized Cancer Immunotherapy: Toward Targeting the Origin of Tumor Development Immunogenicity.

Cancers (Basel). 2023 Jul 14;15(14):3615. doi: 10.3390/cancers15143615.

Cell Type Annotation Model Selection: General-Purpose vs. Pattern-Aware Feature Gene Selection in Single-Cell RNA-Seq Data.

Genes (Basel). 2023 Feb 26;14(3):596. doi: 10.3390/genes14030596.

A new method for identifying industrial clustering using the standard deviational ellipse.

Sci Rep. 2023 Jan 11;13(1):578. doi: 10.1038/s41598-023-27655-8.
