Tail-Robust Quantile Normalization.

Affiliations

Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, 79104, Freiburg, Germany.

Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, 79104, Freiburg, Germany.

Publication information

Proteomics. 2020 Dec;20(24):e2000068. doi: 10.1002/pmic.202000068. Epub 2020 Oct 7.

DOI: 10.1002/pmic.202000068
PMID: 32865322
Abstract

High-throughput biological data, such as mass spectrometry (MS)-based proteomics data, suffer from systematic non-biological variance due to systematic errors. This hinders the estimation of "real" biological signals and, in turn, decreases the power of statistical tests and biases the identification of differentially expressed proteins. To remove such unintended variation while retaining the biological signal of interest, analysis workflows for quantitative MS data typically comprise normalization prior to statistical analysis. Several normalization methods, such as quantile normalization (QN), were originally developed for microarray data. In contrast to microarray data, proteomics data may contain features, in the form of protein intensities, that are consistently high across experimental conditions and, hence, are encountered in the tails of the protein intensity distribution. If QN is applied in the presence of such proteins, statistical inference on the features' intensity profiles is impeded by the biased estimation of their variance. A freely available, novel approach is introduced that improves on classical QN by preserving the biological signals of features in the tails of the intensity distribution and by accounting for sample-dependent missing values (MVs): "tail-robust quantile normalization" (TRQN).
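
The tail problem described in the abstract can be illustrated with a minimal sketch of classical QN. This is not the TRQN method, whose details the abstract does not give. The function name `quantile_normalize`, the simulated log-intensities, and the choice of protein 0 as the consistently high feature are all assumptions made for illustration, and the sketch assumes a complete matrix with no missing values and no ties within a sample.

```python
import numpy as np

def quantile_normalize(x):
    """Classical quantile normalization of a (features x samples) matrix.

    Illustrative sketch: assumes no missing values and no ties within a column.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each value within its own sample (column), 0 = smallest.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Reference distribution: mean across samples of the sorted columns.
    reference = np.sort(x, axis=0).mean(axis=1)
    # Replace every value by the reference value at its rank.
    return reference[ranks]

rng = np.random.default_rng(0)
# Hypothetical log-intensities: 200 proteins x 6 samples.
data = rng.normal(loc=20.0, scale=2.0, size=(200, 6))
# Make protein 0 the most intense feature in every sample,
# while its intensity still varies from sample to sample.
data[0] = data.max(axis=0) + rng.uniform(1.0, 3.0, size=6)

normalized = quantile_normalize(data)
print("variance of protein 0 before QN:", data[0].var())
print("variance of protein 0 after QN: ", normalized[0].var())
```

Because protein 0 holds the top rank in every sample, classical QN assigns it the same reference value everywhere, so its between-sample variation is removed entirely. This is one concrete way in which variance estimates for tail features become biased under QN.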

Similar articles

1
Tail-Robust Quantile Normalization.
Proteomics. 2020 Dec;20(24):e2000068. doi: 10.1002/pmic.202000068. Epub 2020 Oct 7.
2
Normalization and missing value imputation for label-free LC-MS analysis.
BMC Bioinformatics. 2012;13 Suppl 16(Suppl 16):S5. doi: 10.1186/1471-2105-13-S16-S5. Epub 2012 Nov 5.
3
Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics.
J Proteome Res. 2006 Feb;5(2):277-86. doi: 10.1021/pr050300l.
4
Quantile normalization approach for liquid chromatography-mass spectrometry-based metabolomic data from healthy human volunteers.
Anal Sci. 2012;28(8):801-5. doi: 10.2116/analsci.28.801.
5
A multi-model statistical approach for proteomic spectral count quantitation.
J Proteomics. 2016 Jul 20;144:23-32. doi: 10.1016/j.jprot.2016.05.032. Epub 2016 May 31.
6
Using generalized procrustes analysis (GPA) for normalization of cDNA microarray data.
BMC Bioinformatics. 2008 Jan 16;9:25. doi: 10.1186/1471-2105-9-25.
7
Smooth quantile normalization.
Biostatistics. 2018 Apr 1;19(2):185-198. doi: 10.1093/biostatistics/kxx028.
8
RobNorm: model-based robust normalization method for labeled quantitative mass spectrometry proteomics data.
Bioinformatics. 2021 May 5;37(6):815-821. doi: 10.1093/bioinformatics/btaa904.
9
How to do quantile normalization correctly for gene expression data analyses.
Sci Rep. 2020 Sep 23;10(1):15534. doi: 10.1038/s41598-020-72664-6.
10
Subset quantile normalization using negative control features.
J Comput Biol. 2010 Oct;17(10):1385-95. doi: 10.1089/cmb.2010.0049.

Cited by

1
Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference.
Nat Commun. 2024 May 9;15(1):3922. doi: 10.1038/s41467-024-47899-w.
2
MSPypeline: a python package for streamlined data analysis of mass spectrometry-based proteomics.
Bioinform Adv. 2022 Jan 17;2(1):vbac004. doi: 10.1093/bioadv/vbac004. eCollection 2022.
3
Adverse Effects of COVID-19 Vaccination: Machine Learning and Statistical Approach to Identify and Classify Incidences of Morbidity and Postvaccination Reactogenicity.
Healthcare (Basel). 2022 Dec 22;11(1):31. doi: 10.3390/healthcare11010031.
4
Optimized sample preparation and data analysis for TMT proteomic analysis of cerebrospinal fluid applied to the identification of Alzheimer's disease biomarkers.
Clin Proteomics. 2022 May 14;19(1):13. doi: 10.1186/s12014-022-09354-0.
5
Benchmarking of analysis strategies for data-independent acquisition proteomics using a large-scale dataset comprising inter-patient heterogeneity.
Nat Commun. 2022 May 12;13(1):2622. doi: 10.1038/s41467-022-30094-0.