scRNMF: An imputation method for single-cell RNA-seq data by robust and non-negative matrix factorization.

Affiliations

Institute Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.

Publication information

PLoS Comput Biol. 2024 Aug 8;20(8):e1012339. doi: 10.1371/journal.pcbi.1012339. eCollection 2024 Aug.

Abstract

Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool in genomics research, enabling the analysis of gene expression at the individual cell level. However, scRNA-seq data often suffer from a high rate of dropouts, where certain genes fail to be detected in specific cells due to technical limitations. These missing data can introduce biases and hinder downstream analysis. To overcome this challenge, effective imputation methods have become crucial in scRNA-seq data analysis. Here, we propose an imputation method based on robust and non-negative matrix factorization (scRNMF). Unlike other matrix factorization algorithms, scRNMF integrates two loss functions: L2 loss and C-loss. The L2 loss function is highly sensitive to outliers, which can introduce substantial errors, so we use the C-loss function when dealing with zero values in the raw data. The primary advantage of the C-loss function is that it imposes a smaller penalty on larger errors, which results in more robust factorization when handling outliers. Datasets of different sizes and zero rates are used to evaluate the performance of scRNMF against other state-of-the-art methods. Our method demonstrates its power and stability as a tool for imputation of scRNA-seq data.
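The contrast between the two losses can be illustrated with a small sketch. The abstract does not give the formula for the C-loss, so the code below assumes the commonly used correntropy-induced form, 1 - exp(-e^2 / (2 sigma^2)). The names `l2_loss`, `c_loss`, `hybrid_objective` and the parameter `sigma` are hypothetical, and the objective omits any regularization or optimization steps the actual method may use. It only shows how a squared error grows without bound while a saturating loss caps the penalty on large errors, and how the two could be combined by applying the C-loss to zero entries only.

```python
import numpy as np


def l2_loss(err):
    """Squared error: the penalty grows without bound as |err| grows."""
    return err ** 2


def c_loss(err, sigma=1.0):
    """Assumed correntropy-induced loss: saturates at 1 for large |err|."""
    return 1.0 - np.exp(-(err ** 2) / (2.0 * sigma ** 2))


def hybrid_objective(X, W, H, sigma=1.0):
    """Illustrative objective for X ~ W @ H.

    L2 loss is applied to the nonzero entries of X, and the assumed
    C-loss to the zero entries, which may be dropouts.
    """
    err = X - W @ H
    zero_mask = (X == 0)
    observed_term = l2_loss(err[~zero_mask]).sum()
    zero_term = c_loss(err[zero_mask], sigma).sum()
    return observed_term + zero_term


if __name__ == "__main__":
    errors = np.array([0.1, 1.0, 10.0])
    print("L2 loss:", l2_loss(errors))
    print("C-loss: ", c_loss(errors))

    # Toy expression matrix (rows: genes, columns: cells) with two zeros.
    X = np.array([[0.0, 2.0], [3.0, 0.0]])
    W = np.ones((2, 1))
    H = np.ones((1, 2))
    print("Hybrid objective:", hybrid_objective(X, W, H))
```

Under this assumed form, a zero entry whose reconstructed value is large contributes at most 1 to the objective, so the factorization is not pulled strongly toward reproducing zeros that may be dropouts.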

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b31b/11338450/efbd6468edd6/pcbi.1012339.g001.jpg
