Costs and Benefits of Popular P-Value Correction Methods in Three Models of Quantitative Omic Experiments.

Affiliations

Department of Chemistry, Stanford University, 364 Lomita Dr., Stanford, California 94305, United States.

Department of Neurology and Neurological Sciences, Stanford University School of Medicine, 291 Campus Dr., Stanford, California 94305, United States.

Publication information

Anal Chem. 2023 Feb 7;95(5):2732-2740. doi: 10.1021/acs.analchem.2c03719. Epub 2023 Jan 24.

DOI:10.1021/acs.analchem.2c03719

PMID:36693222

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10653731/

Abstract

The multiple hypothesis testing problem is inherent in large-scale quantitative "omic" experiments such as mass spectrometry-based proteomics. Yet, tools for comparing the costs and benefits of different p-value correction methods under different experimental conditions are lacking. We performed thousands of simulations of omic experiments under a range of experimental conditions and applied correction using the Benjamini-Hochberg (BH), Bonferroni, and permutation-based false discovery proportion (FDP) estimation methods. The tremendous false discovery rate (FDR) benefit of correction was confirmed in a range of different contexts. No correction method can guarantee a low FDP in a single experiment, but the probability of a high FDP is small when a high number and proportion of corrected p-values are significant. On average, correction decreased sensitivity, but the sensitivity costs of BH and permutation were generally modest compared to the FDR benefits. In a given experiment, observed sensitivity was always maintained or decreased by BH and Bonferroni, whereas it was often increased by permutation. Overall, permutation had better FDR and sensitivity than BH. We show how increasing sample size, decreasing variability, or increasing effect size can enable the detection of all true changes while still correcting p-values, and we present basic guidelines for omic experimental design. Analysis of an experimental proteomic data set with defined changes corroborated these trends. We developed an R Shiny web application for further exploration and visualization of these models, which we call the Simulator of P-value Multiple Hypothesis Correction (SIMPLYCORRECT), and a high-performance R package, permFDP, for easy use of the permutation-based FDP estimation method.
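
The trade-off described in the abstract can be illustrated with a minimal sketch of the two closed-form corrections it names, Bonferroni and Benjamini-Hochberg. This is not code from SIMPLYCORRECT or permFDP; the function names, the eight toy p-values, and the ground-truth labels are all assumptions invented for illustration, and the permutation-based method is not covered here. With known truth labels, the false discovery proportion (FDP) and sensitivity can be computed directly for each method.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """BH-adjusted p-values (step-up procedure), returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdp_and_sensitivity(values, is_true_change, alpha=0.05):
    """FDP and sensitivity at threshold alpha, given known ground truth."""
    hits = [v <= alpha for v in values]
    n_hits = sum(hits)
    true_pos = sum(h and t for h, t in zip(hits, is_true_change))
    fdp = (n_hits - true_pos) / n_hits if n_hits else 0.0
    sensitivity = true_pos / sum(is_true_change)
    return fdp, sensitivity


# Toy inputs (hypothetical): eight analytes, four of which truly change.
pvals = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]
truth = [True, True, True, False, True, False, False, False]

results = {
    "uncorrected": fdp_and_sensitivity(pvals, truth),
    "BH": fdp_and_sensitivity(benjamini_hochberg(pvals), truth),
    "Bonferroni": fdp_and_sensitivity(bonferroni(pvals), truth),
}

for method, (fdp, sens) in results.items():
    print(f"{method:12s} FDP={fdp:.2f} sensitivity={sens:.2f}")
```

In this toy case both corrections remove the single false discovery, but Bonferroni gives up more true changes than BH does. A single hand-picked example like this only shows the mechanics; it says nothing about average behavior, which is what the simulations in the paper address.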
