A Random Matrix Theory Approach to Denoise Single-Cell Data.

Authors

Aparicio Luis, Bordyuh Mykola, Blumberg Andrew J, Rabadan Raul

Affiliations

Department of Systems Biology, Columbia University, New York NY 10032, USA.

Department of Biomedical Informatics, Columbia University, New York NY 10032, USA.

Publication information

Patterns (N Y). 2020 May 4;1(3):100035. doi: 10.1016/j.patter.2020.100035. eCollection 2020 Jun 12.

Abstract

Single-cell technologies provide the opportunity to identify new cellular states. However, a major obstacle to the identification of biological signals is noise in single-cell data. In addition, single-cell data are very sparse. We propose a new method based on random matrix theory to analyze and denoise single-cell sequencing data. The method uses the universal distributions predicted by random matrix theory for the eigenvalues and eigenvectors of random covariance/Wishart matrices to distinguish noise from signal. In addition, we explain how sparsity can cause spurious eigenvector localization, falsely identifying meaningful directions in the data. We show that roughly 95% of the information in single-cell data is compatible with the predictions of random matrix theory, about 3% is spurious signal induced by sparsity, and only the last 2% reflects true biological signal. We demonstrate the effectiveness of our approach by comparing with alternative techniques in a variety of examples with marked cell populations.
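The core idea of separating noise from signal by eigenvalue statistics can be illustrated with a minimal sketch. It is not the authors' implementation. It uses only the Marchenko-Pastur bulk edge, a standard random matrix theory result for covariance/Wishart matrices, and omits the eigenvector and sparsity analysis described in the abstract. The function names, the simulated data, the unit noise variance, and the 5% tolerance above the edge are all assumptions made for illustration.

```python
import numpy as np

def mp_upper_edge(n_samples, n_features, sigma2=1.0):
    """Upper edge of the Marchenko-Pastur bulk for the covariance X^T X / n."""
    gamma = n_features / n_samples
    return sigma2 * (1.0 + np.sqrt(gamma)) ** 2

def count_signal_eigenvalues(X, sigma2=1.0, tol=0.05):
    """Count covariance eigenvalues lying clearly above the noise bulk.

    sigma2 (noise variance) and tol (safety margin above the edge) are
    illustrative assumptions, not values taken from the paper.
    """
    n, p = X.shape
    eigvals = np.linalg.eigvalsh(X.T @ X / n)  # symmetric p x p covariance
    edge = mp_upper_edge(n, p, sigma2)
    return int(np.sum(eigvals > edge * (1.0 + tol)))

# Simulated data: rows play the role of cells, columns of genes (hypothetical sizes).
rng = np.random.default_rng(0)
n_cells, n_genes = 2000, 500
noise = rng.standard_normal((n_cells, n_genes))

# Plant one rank-one "biological" direction supported on the first 50 genes.
loading = np.zeros(n_genes)
loading[:50] = 1.0 / np.sqrt(50)
activity = rng.standard_normal(n_cells)
spiked = noise + 3.0 * np.outer(activity, loading)

edge = mp_upper_edge(n_cells, n_genes)
n_noise = count_signal_eigenvalues(noise)
n_signal = count_signal_eigenvalues(spiked)
print(f"MP upper edge: {edge:.3f}")
print(f"outlier eigenvalues, pure noise: {n_noise}")
print(f"outlier eigenvalues, noise + planted signal: {n_signal}")
```

In this toy setting the pure-noise matrix should produce no eigenvalues beyond the bulk edge, while the planted direction should appear as a single outlier. Real single-cell data would additionally need normalization and the sparsity checks the abstract describes.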


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c77/7660363/b768b4d48e8a/fx1.jpg
