RUV-III-NB: normalization of single cell RNA-seq data.

Affiliations

Melbourne School of Population and Global Health, University of Melbourne, VIC 3053, Australia.

Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research Parkville, VIC 3052, Australia.

Publication information

Nucleic Acids Res. 2022 Sep 9;50(16):e96. doi: 10.1093/nar/gkac486.

Abstract

Normalization of single cell RNA-seq data remains a challenging task. The performance of different methods can vary greatly between datasets when unwanted factors and biology are associated. Most normalization methods also remove the effects of unwanted variation only from the cell embedding, not from the gene-level data typically used for differential expression (DE) analysis to identify marker genes. We propose RUV-III-NB, a method that can be used to remove unwanted variation from both the cell embedding and gene-level counts. Using pseudo-replicates, RUV-III-NB explicitly takes into account potential association with biology when removing unwanted variation. The method can be used for either UMI or read counts and returns adjusted counts that can be used for downstream analyses such as clustering, DE and pseudotime analyses. Using published datasets with different technological platforms, kinds of biology and levels of association between biology and unwanted variation, we show that RUV-III-NB removes library size and batch effects, strengthens biological signals, improves DE analyses, and leads to results exhibiting greater concordance with independent datasets of the same kind. The performance of RUV-III-NB is consistent and is not sensitive to the number of factors assumed to contribute to the unwanted variation.
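The pseudo-replicate idea in the abstract can be illustrated with a minimal sketch. This is not RUV-III-NB itself: it is the simpler linear RUV-III-style adjustment on log-scale data, written under several assumptions that the abstract does not state. The function name `ruv3_linear`, the use of negative control genes (`ctl`), the choice to define pseudo-replicates by a known cell type, and the simulated data are all hypothetical and serve illustration only. The actual method works with a negative binomial model on counts.

```python
import numpy as np

def ruv3_linear(Y, M, ctl, k):
    """Linear RUV-III-style adjustment (illustrative, not RUV-III-NB).

    Y   : cells x genes matrix of log-scale expression
    M   : cells x groups indicator matrix of (pseudo-)replicate membership
    ctl : boolean mask of genes assumed free of biological signal
    k   : assumed number of unwanted factors
    """
    Y = np.asarray(Y, dtype=float)
    M = np.asarray(M, dtype=float)
    # Center genes so that gene means are not mistaken for unwanted variation.
    Yc = Y - Y.mean(axis=0)
    # Residual operator: subtracts each pseudo-replicate group's mean, so
    # what remains within a group is unwanted variation plus noise.
    RM = np.eye(Y.shape[0]) - M @ np.linalg.pinv(M)
    # Leading right singular vectors estimate the gene-level loadings.
    _, _, Vt = np.linalg.svd(RM @ Yc, full_matrices=False)
    alpha = Vt[:k]                                   # k x genes
    # Regress the control genes on their loadings to get cell-level factors.
    ac = alpha[:, ctl]
    W = Yc[:, ctl] @ ac.T @ np.linalg.inv(ac @ ac.T)  # cells x k
    return Y - W @ alpha, W

# Simulated example: batch is associated with cell type (unbalanced design).
rng = np.random.default_rng(0)
cell_type = np.repeat([0, 1], 100)
batch = np.concatenate([np.repeat([0, 1], [80, 20]),
                        np.repeat([0, 1], [20, 80])])
n_genes = 50
ctl = np.zeros(n_genes, dtype=bool)
ctl[:20] = True                                      # first 20 genes: controls
alpha_true = rng.normal(0, 2, size=(1, n_genes))     # batch effect per gene
biology = np.zeros((200, n_genes))
biology[cell_type == 1, 20:35] = 3.0                 # marker genes of type 1
Y = biology + batch[:, None] * alpha_true + rng.normal(0, 0.3, (200, n_genes))

M = np.eye(2)[cell_type]                             # pseudo-replicate sets
adjusted, W_hat = ruv3_linear(Y, M, ctl, k=1)

def batch_gap(X):
    """Mean absolute between-batch difference within cell type 0."""
    t0 = cell_type == 0
    return np.abs(X[t0 & (batch == 1)].mean(0) - X[t0 & (batch == 0)].mean(0)).mean()

print("batch gap before:", batch_gap(Y))
print("batch gap after: ", batch_gap(adjusted))
```

Because the group means are removed before the unwanted factors are estimated, biology shared within a pseudo-replicate set cannot leak into the estimated loadings, even though batch and cell type are associated in the simulated design. The adjustment is applied to the gene-level matrix itself, so the corrected values can feed a DE comparison as well as an embedding.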

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b14a/9458465/af723477d4b8/gkac486fig1.jpg
