nPCA: a linear dimensionality reduction method using a multilayer perceptron.

Authors

Li Juzeng, Wang Yi

Affiliations

Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, China.

Human Phenome Institute, Fudan University, Shanghai, China.

Publication information

Front Genet. 2024 Jan 8;14:1290447. doi: 10.3389/fgene.2023.1290447. eCollection 2023.

Abstract

Linear dimensionality reduction techniques are widely used in many applications. The goal of dimensionality reduction is to remove noise from data and extract its main features. Several dimensionality reduction methods have been developed, such as the linear principal component analysis (PCA), the nonlinear t-distributed stochastic neighbor embedding (t-SNE), and the deep-learning-based autoencoder (AE). However, PCA only determines the projection direction with the highest variance, t-SNE is sometimes suitable only for visualization, and AE and nonlinear methods discard the linear projection. To retain a linear projection of the raw data and produce better dimensionality reduction results for both visualization and downstream analysis, we present neural principal component analysis (nPCA), an unsupervised deep learning approach that retains richer information from the raw data and is a promising improvement on PCA. To evaluate the nPCA algorithm, we compare its performance on 10 public datasets and 6 single-cell RNA sequencing (scRNA-seq) datasets of the pancreas, benchmarking our method against other classic linear dimensionality reduction methods. We conclude that nPCA is a competitive alternative for dimensionality reduction tasks.
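To make the notion of a "linear projection" concrete, the following minimal sketch shows classical PCA as a single matrix multiplication applied to centered data. It illustrates the baseline the abstract compares against, not nPCA itself, whose architecture and training objective are not described in the abstract. The function name `pca_projection` and the synthetic data are assumptions introduced purely for illustration.

```python
import numpy as np

def pca_projection(X, n_components):
    """Return (W, mean) so that Z = (X - mean) @ W is the PCA embedding.

    Illustrative helper (hypothetical name), not taken from the nPCA paper.
    """
    mean = X.mean(axis=0)
    Xc = X - mean  # center each feature
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T  # shape: (n_features, n_components)
    return W, mean

# Synthetic data (an assumption): 200 samples, 5 features with unequal scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

W, mean = pca_projection(X, n_components=2)
Z = (X - mean) @ W  # the reduction is a purely linear map of the centered data
```

Because the embedding is just `(X - mean) @ W`, new samples can be projected with the same `W`, and each reduced coordinate is a weighted sum of the original features. This is the property that, according to the abstract, nonlinear methods and autoencoders give up.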

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc9/10800564/b84ea0b8d8cf/fgene-14-1290447-g001.jpg

Similar articles

1
nPCA: a linear dimensionality reduction method using a multilayer perceptron.
Front Genet. 2024 Jan 8;14:1290447. doi: 10.3389/fgene.2023.1290447. eCollection 2023.
2
Visualization of Single Cell RNA-Seq Data Using t-SNE in R.
Methods Mol Biol. 2020;2117:159-167. doi: 10.1007/978-1-0716-0301-7_8.
3
Visualizing Single-Cell RNA-seq Data with Semisupervised Principal Component Analysis.
Int J Mol Sci. 2020 Aug 12;21(16):5797. doi: 10.3390/ijms21165797.
4
A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data.
Front Genet. 2021 Mar 23;12:646936. doi: 10.3389/fgene.2021.646936. eCollection 2021.
5
Performance comparison of dimensionality reduction methods on RNA-Seq data from the GTEx project.
Genes Genomics. 2020 Feb;42(2):225-234. doi: 10.1007/s13258-019-00896-6. Epub 2019 Dec 12.
6
UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study.
J Phys Chem B. 2021 May 20;125(19):5022-5034. doi: 10.1021/acs.jpcb.1c02081. Epub 2021 May 11.
7
Dimensionality Reduction of Single-Cell RNA-Seq Data.
Methods Mol Biol. 2021;2284:331-342. doi: 10.1007/978-1-0716-1307-8_18.
8
Capturing discrete latent structures: choose LDs over PCs.
Biostatistics. 2022 Dec 12;24(1):1-16. doi: 10.1093/biostatistics/kxab030.
9
Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data.
Cell Rep. 2021 Jul 27;36(4):109442. doi: 10.1016/j.celrep.2021.109442.
