Data normalization for addressing the challenges in the analysis of single-cell transcriptomic datasets.

Affiliations

Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, Nuevo Leon, 64710, Mexico.

The Vivian L. Smith Department of Neurosurgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Publication information

BMC Genomics. 2024 May 6;25(1):444. doi: 10.1186/s12864-024-10364-5.

DOI:10.1186/s12864-024-10364-5
PMID:38711017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11073985/
Abstract

BACKGROUND

Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed addressing different sources of dispersion and making specific assumptions about the count data.
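
As a rough illustration of what "making gene counts comparable between cells" can mean in practice, the sketch below applies library-size scaling followed by a log transform, one simple and widely used approach. It is not taken from the review: the function name `normalize_counts`, the scale factor of 10,000, and the toy counts are all assumptions chosen for illustration.

```python
import math


def normalize_counts(counts, scale=10_000):
    """Scale each cell to a common total count, then apply log1p.

    counts: list of cells, each a list of raw gene counts (cells x genes).
    scale:  target total count per cell (an illustrative default).
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # A cell with no counts carries no signal; keep zeros
            # instead of dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


# Two hypothetical cells with the same expression profile,
# sequenced at a 10-fold difference in depth.
raw = [[10, 30, 60], [100, 300, 600]]
norm = normalize_counts(raw)
```

After scaling, the two toy cells have matching values, because the difference in sequencing depth is purely technical in this example. Real data also contain biological differences in total RNA content, which this simple scaling cannot distinguish from technical ones.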

MAIN BODY

The choice of normalization method directly affects downstream analysis, for example differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this end, we first give an overview of the single-cell sequencing platforms and methods in common use, including isolation and library preparation protocols. Next, we discuss the inherent sources of variability in scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also outline imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis.

CONCLUSIONS

According to the correction performed, normalization methods can be broadly classified as within-sample and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can be further classified into global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods has pros and cons and makes different statistical assumptions. However, no single normalization method performs best. Instead, metrics such as silhouette width, the K-nearest neighbor batch-effect test, or highly variable genes are recommended to assess the performance of normalization methods.
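
To make the silhouette width metric concrete, here is a minimal sketch of the standard definition: for each point, s = (b - a) / max(a, b), where a is the mean distance to points in its own cluster and b is the lowest mean distance to any other cluster. The helper names, the Euclidean distance, and the toy coordinates are assumptions for illustration and are not drawn from the review.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def mean_silhouette_width(points, labels):
    """Average silhouette width over all points (needs >= 2 clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 2-D embedding of four cells forming two tight groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = mean_silhouette_width(points, ["A", "A", "B", "B"])
mixed = mean_silhouette_width(points, ["A", "B", "A", "B"])
```

Here `good` is close to 1 because the labels match the spatial groups, while `mixed` is negative because each label spans both groups. When comparing normalization methods, such a score would be computed on the normalized data with known cell-type or batch labels; which labels to use and which direction counts as better depends on the question being asked.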

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1729/11075207/42f514823be1/12864_2024_10364_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1729/11075207/c5979b47f833/12864_2024_10364_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1729/11075207/397023f3452c/12864_2024_10364_Fig2_HTML.jpg
