The Impact of Normalization Methods on RNA-Seq Data Analysis.

Author information

Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I

Affiliations

Department of Mathematical and Statistical Methods, Poznan University of Life Sciences, 60-637 Poznan, Poland.

Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland ; Department of Hematology and Bone Marrow Transplantation, Poznan University of Medical Sciences, 60-569 Poznan, Poland.

Publication information

Biomed Res Int. 2015;2015:621690. doi: 10.1155/2015/621690. Epub 2015 Jun 15.

DOI:10.1155/2015/621690
PMID:26176014
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4484837/

Abstract

High-throughput sequencing technologies, such as the Illumina HiSeq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for the development of statistical and computational methods that can tackle the analysis and management of data. Data normalization is one of the most crucial steps of data processing, and it must be carefully considered because it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied to select the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, the sensitivity and specificity of the methods, and classification errors, as well as generation of diagnostic plots. Combining this information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably.
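
The first step of the workflow described in the abstract (bias and variance for control genes under competing depth normalizations) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not name the five methods, so total-count and upper-quartile scaling stand in as two common depth-related choices; the toy count matrix, the function names, and the bias and variance definitions (absolute between-group log2 difference and across-sample log2 variance for genes assumed to be non-differential) are hypothetical and not taken from the paper.

```python
import math
import statistics

# Toy count matrix (hypothetical): rows are genes, columns are samples.
# Samples 2 and 3 are sequenced about twice as deeply as samples 0 and 1.
counts = [
    [10, 10, 20, 20],      # control gene
    [100, 100, 200, 200],  # control gene
    [50, 50, 100, 100],    # control gene
    [5, 5, 40, 40],        # differentially expressed gene
]
controls = [0, 1, 2]   # row indices of genes assumed non-differential
groups = [0, 0, 1, 1]  # condition label of each sample


def total_count_factors(matrix):
    """Per-sample factor: library size divided by the mean library size."""
    sizes = [sum(col) for col in zip(*matrix)]
    mean_size = statistics.mean(sizes)
    return [s / mean_size for s in sizes]


def upper_quartile_factors(matrix):
    """Per-sample factor: 75th percentile of nonzero counts, rescaled to mean 1."""
    uq = []
    for col in zip(*matrix):
        nonzero = sorted(c for c in col if c > 0)
        uq.append(statistics.quantiles(nonzero, n=4, method="inclusive")[2])
    mean_uq = statistics.mean(uq)
    return [q / mean_uq for q in uq]


def normalize(matrix, factors):
    """Divide every count by the scaling factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in matrix]


def control_bias_variance(matrix, control_rows, labels):
    """Illustrative bias and variance of log2(x + 1) over the control genes.

    Bias: mean absolute difference of group means (ideally 0 for controls).
    Variance: mean across-sample variance of the log2 values.
    """
    biases, variances = [], []
    for i in control_rows:
        logs = [math.log2(v + 1.0) for v in matrix[i]]
        a = [x for x, g in zip(logs, labels) if g == 0]
        b = [x for x, g in zip(logs, labels) if g == 1]
        biases.append(abs(statistics.mean(b) - statistics.mean(a)))
        variances.append(statistics.variance(logs))
    return statistics.mean(biases), statistics.mean(variances)


raw_bias, raw_var = control_bias_variance(counts, controls, groups)
tc_bias, tc_var = control_bias_variance(
    normalize(counts, total_count_factors(counts)), controls, groups
)
uq_bias, uq_var = control_bias_variance(
    normalize(counts, upper_quartile_factors(counts)), controls, groups
)

for name, b, v in [("raw", raw_bias, raw_var),
                   ("total count", tc_bias, tc_var),
                   ("upper quartile", uq_bias, uq_var)]:
    print(f"{name:>15}: bias={b:.4f}  variance={v:.4f}")
```

In this toy data the differentially expressed gene inflates the library size of the deeper samples, so total-count scaling leaves a small residual bias in the controls, while upper-quartile scaling removes it. Which method performs best on real data depends on the data set, which is why the abstract recommends comparing several criteria rather than fixing one method in advance.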

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f001/4484837/eff2a046e7b8/BMRI2015-621690.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f001/4484837/9ce09febd15e/BMRI2015-621690.001.jpg