Normalization Methods for the Analysis of Unbalanced Transcriptome Data: A Review.

Authors

Liu Xueyan, Li Nan, Liu Sheng, Wang Jun, Zhang Ning, Zheng Xubin, Leung Kwong-Sak, Cheng Lixin

Affiliations

Department of Critical Care Medicine, Shenzhen People's Hospital, The Second Clinical Medicine College of Jinan University, Shenzhen, China.

Department of Stomatology Center, Shenzhen People's Hospital, Second Clinical Medicine College of Jinan University, Shenzhen, China.

Publication information

Front Bioeng Biotechnol. 2019 Nov 26;7:358. doi: 10.3389/fbioe.2019.00358. eCollection 2019.

DOI:10.3389/fbioe.2019.00358
PMID:32039167
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6988798/

Abstract

Dozens of normalization methods for correcting experimental variation and bias in high-throughput expression data have been developed during the last two decades. Up to 23 methods among them consider the skewness of expression data between sample states, which are even more than the conventional methods, such as loess and quantile. From the perspective of reference selection, we classified the normalization methods for skewed expression data into three categories, data-driven reference, foreign reference, and entire gene set. We separately introduced and summarized these normalization methods designed for gene expression data with global shift between compared conditions, including both microarray and RNA-seq, based on the reference selection strategies. To our best knowledge, this is the most comprehensive review of available preprocessing algorithms for the unbalanced transcriptome data. The anatomy and summarization of these methods shed light on the understanding and appropriate application of preprocessing methods.
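
The difference between normalizing on the entire gene set and normalizing on a chosen reference can be illustrated with a small sketch. This is not one of the methods surveyed in the review; it is a toy median-centering example that assumes log2 expression values, a hypothetical set of invariant reference genes (indices 0 and 1), and a made-up sample-wide technical offset. The function name `center_on_reference` is introduced here for illustration only.

```python
import numpy as np

def center_on_reference(log_expr, ref_idx):
    """Shift each sample (column) so the median of its reference genes is zero."""
    offsets = np.median(log_expr[ref_idx, :], axis=0)
    return log_expr - offsets

# Toy log2 expression for 6 genes in a control sample.
control = np.array([5.0, 6.0, 4.0, 7.0, 8.0, 3.0])

# Assumed ground truth: genes 0-1 are unchanged, genes 2-5 are all up by 2
# in the treated sample (an unbalanced, one-directional global shift).
true_change = np.array([0.0, 0.0, 2.0, 2.0, 2.0, 2.0])
technical_bias = 1.0  # hypothetical sample-wide offset in the treated sample
treated = control + true_change + technical_bias

log_expr = np.column_stack([control, treated])

all_genes = np.arange(log_expr.shape[0])  # "entire gene set" as the reference
ref_genes = np.array([0, 1])              # assumed invariant reference subset

global_norm = center_on_reference(log_expr, all_genes)
ref_norm = center_on_reference(log_expr, ref_genes)

# Estimated log fold changes (treated minus control) under each choice.
global_lfc = global_norm[:, 1] - global_norm[:, 0]
ref_lfc = ref_norm[:, 1] - ref_norm[:, 0]

print("entire gene set:", global_lfc)
print("reference subset:", ref_lfc)
```

In this toy case, centering on all genes absorbs part of the real shift into the offset, so the unchanged genes appear down-regulated and the up-regulated genes are underestimated, whereas centering on the assumed invariant genes recovers the simulated changes. In practice the reference set is unknown and has to be estimated or supplied externally, which is the choice the review's three categories describe.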

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aada/6988798/adb4da9f61be/fbioe-07-00358-g0001.jpg
