Benchmarking principal component analysis for large-scale single-cell RNA-sequencing.

Affiliations

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research, Wako, Saitama, 351-0198, Japan.

Japan Science and Technology Agency, PRESTO, 5-3, Yonbancho, Chiyoda-ku, Tokyo, 102-8666, Japan.

Publication information

Genome Biol. 2020 Jan 20;21(1):9. doi: 10.1186/s13059-019-1900-3.

Abstract

BACKGROUND

Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but on large-scale scRNA-seq datasets it takes a long time to compute and consumes large amounts of memory.

RESULTS

In this work, we review the existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Our benchmark shows that some PCA algorithms based on Krylov subspace and randomized singular value decomposition are fast, memory-efficient, and more accurate than the other algorithms.
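To make the randomized singular value decomposition family mentioned above more concrete, here is a minimal NumPy sketch of the general technique (random projection, a few power iterations, then an exact SVD of the small projected matrix). It is an illustration only and is not one of the implementations evaluated in the benchmark; the function name `randomized_pca`, its parameters (`n_oversamples`, `n_power_iter`, `seed`), and the toy low-rank matrix are all assumptions introduced for this example.

```python
import numpy as np


def randomized_pca(X, n_components, n_oversamples=10, n_power_iter=4, seed=0):
    """Approximate the top principal components of X (rows = cells, columns = genes).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)  # center each gene (column)
    k = min(n_components + n_oversamples, min(Xc.shape))

    # Project the data onto a random k-dimensional subspace and orthonormalize.
    Q = np.linalg.qr(Xc @ rng.standard_normal((Xc.shape[1], k)))[0]

    # Power iterations sharpen the subspace; QR at each step keeps them stable.
    for _ in range(n_power_iter):
        Q = np.linalg.qr(Xc.T @ Q)[0]
        Q = np.linalg.qr(Xc @ Q)[0]

    # Exact SVD of the small projected matrix (k x genes).
    U_small, s, Vt = np.linalg.svd(Q.T @ Xc, full_matrices=False)
    scores = (Q @ U_small)[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], s[:n_components]


# Toy data (an assumption, not real scRNA-seq): rank-5 signal plus small noise.
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 200))
X += 0.01 * rng.standard_normal(X.shape)

scores, components, sing_vals = randomized_pca(X, n_components=5)

# Compare against the exact singular values of the centered matrix.
exact = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)[:5]
max_rel_err = np.max(np.abs(sing_vals - exact) / exact)
```

The sketch still holds the full dense matrix in memory, so it shows only the arithmetic of the approach. Memory-efficient implementations for large datasets would typically also rely on sparse or out-of-core matrix products, which are omitted here.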

CONCLUSION

We develop a guideline to select an appropriate PCA implementation based on the differences in the computational environment of users and developers.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dfe/6970290/dd9b159f0e18/13059_2019_1900_Fig1_HTML.jpg
