Limitations of empirical calibration of p-values using observational data.

Authors

Gruber Susan, Tchetgen Tchetgen Eric

Affiliations

Reagan-Udall Foundation for the Food and Drug Administration, Washington, DC, U.S.A.

Departments of Biostatistics and Epidemiology, Harvard School of Public Health, Boston, MA, U.S.A.

Publication information

Stat Med. 2016 Sep 30;35(22):3869-82. doi: 10.1002/sim.6936. Epub 2016 Mar 10.

DOI: 10.1002/sim.6936
PMID: 26970249
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5012943/
Abstract

Controversy over non-reproducible published research reporting a statistically significant result has produced substantial discussion in the literature. p-value calibration is a recently proposed procedure for adjusting p-values to account for both random and systematic errors that addresses one aspect of this problem. The method's validity rests on the key assumption that bias in an effect estimate is drawn from a normal distribution whose mean and variance can be correctly estimated. We investigated the method's control of type I and type II error rates using simulated and real-world data. Under mild violations of underlying assumptions, control of the type I error rate can be conservative, while under more extreme departures, it can be anti-conservative. The extent to which the assumption is violated in real-world data analyses is unknown. Barriers to testing the plausibility of the assumption using historical data are discussed. Our studies of the type II error rate using simulated and real-world electronic health care data demonstrated that calibrating p-values can substantially increase the type II error rate. The use of calibrated p-values may reduce the number of false-positive results, but there will be a commensurate drop in the ability to detect a true safety or efficacy signal. While p-value calibration can sometimes offer advantages in controlling the type I error rate, its adoption for routine use in studies of real-world health care datasets is premature. Separate characterizations of random and systematic errors provide a richer context for evaluating uncertainty surrounding effect estimates. Copyright © 2016 John Wiley & Sons, Ltd.
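
The key assumption named in the abstract (bias drawn from a normal distribution with estimable mean and variance) can be made concrete with a small Python sketch. It is illustrative only and is not the authors' code. The function names, the negative-control numbers, and the moment-based fit of the bias distribution are all assumptions introduced here; a real analysis would likely fit the bias distribution differently. The calibrated test compares the estimate with N(mu, tau² + se²) instead of N(0, se²).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def fit_empirical_null(control_estimates, control_ses):
    """Crude moment fit of a N(mu, tau2) bias distribution (an assumption).

    control_estimates: log effect estimates for exposures believed to have
    no true effect (negative controls); control_ses: their standard errors.
    """
    mu = mean(control_estimates)
    # Observed spread = systematic spread + average sampling variance.
    tau2 = max(0.0, variance(control_estimates) - mean(s * s for s in control_ses))
    return mu, tau2

def traditional_p(log_rr, se):
    """Two-sided p-value that accounts for random error only."""
    z = log_rr / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def calibrated_p(log_rr, se, mu, tau2):
    """Two-sided p-value against the fitted null N(mu, tau2 + se^2)."""
    z = (log_rr - mu) / sqrt(tau2 + se * se)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical negative-control results (made-up numbers).
control_log_rr = [0.10, 0.35, 0.20, 0.45, 0.15, 0.30, 0.25, 0.40]
control_se = [0.10] * len(control_log_rr)

mu, tau2 = fit_empirical_null(control_log_rr, control_se)

# Hypothetical estimate for the exposure of interest.
log_rr, se = 0.40, 0.10
print("traditional p:", traditional_p(log_rr, se))
print("calibrated p: ", calibrated_p(log_rr, se, mu, tau2))
```

With these made-up inputs the calibrated p-value is much larger than the traditional one, which shows the trade-off the abstract describes: fewer false positives, but less ability to detect a true signal when the fitted bias distribution is wide or wrongly centered.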


Similar articles

1
Limitations of empirical calibration of p-values using observational data.
Stat Med. 2016 Sep 30;35(22):3869-82. doi: 10.1002/sim.6936. Epub 2016 Mar 10.
2
Big data, observational research and P-value: a recipe for false-positive findings? A study of simulated and real prospective cohorts.
Int J Epidemiol. 2020 Jun 1;49(3):876-884. doi: 10.1093/ije/dyz206.
3
P-values and decision-making: discussion of 'Limitations of empirical calibration of p-values using observational data'.
Stat Med. 2016 Sep 30;35(22):3889-91. doi: 10.1002/sim.6984.
4
Empirical confidence interval calibration for population-level effect estimation studies in observational healthcare data.
Proc Natl Acad Sci U S A. 2018 Mar 13;115(11):2571-2577. doi: 10.1073/pnas.1708282114.
5
Adjusting for both sequential testing and systematic error in safety surveillance using observational data: Empirical calibration and MaxSPRT.
Stat Med. 2023 Feb 28;42(5):619-631. doi: 10.1002/sim.9631. Epub 2023 Jan 15.
6
Response to letter to the editor from Dr Rahman Shiri: The challenging topic of suicide across occupational groups.
Scand J Work Environ Health. 2018 Jan 1;44(1):108-110. doi: 10.5271/sjweh.3698. Epub 2017 Dec 8.
7
8
Evaluation of instrument error and method agreement.
AANA J. 1996 Jun;64(3):261-8.
9
Interpreting observational studies: why empirical calibration is needed to correct p-values.
Stat Med. 2014 Jan 30;33(2):209-18. doi: 10.1002/sim.5925. Epub 2013 Jul 30.
10
A comparison of entropy balance and probability weighting methods to generalize observational cohorts to a population: a simulation and empirical example.
Pharmacoepidemiol Drug Saf. 2017 Apr;26(4):368-377. doi: 10.1002/pds.4121. Epub 2016 Nov 13.

Cited by

1
Open science interventions to improve reproducibility and replicability of research: a scoping review.
R Soc Open Sci. 2025 Apr 9;12(4):242057. doi: 10.1098/rsos.242057. eCollection 2025 Apr.
2
Longitudinal profiling of the microbiome at four body sites reveals core stability and individualized dynamics during health and disease.
Cell Host Microbe. 2024 Apr 10;32(4):506-526.e9. doi: 10.1016/j.chom.2024.02.012. Epub 2024 Mar 12.
3
The State of Use and Utility of Negative Controls in Pharmacoepidemiologic Studies.
Am J Epidemiol. 2024 Feb 5;193(3):426-453. doi: 10.1093/aje/kwad201.
4
Causal models and causal modelling in obesity: foundations, methods and evidence.
Philos Trans R Soc Lond B Biol Sci. 2023 Oct 23;378(1888):20220227. doi: 10.1098/rstb.2022.0227. Epub 2023 Sep 4.
5
The optimal pre-post allocation for randomized clinical trials.
BMC Med Res Methodol. 2023 Mar 28;23(1):72. doi: 10.1186/s12874-023-01893-w.
6
Risk of retinal detachment and exposure to fluoroquinolones, common antibiotics, and febrile illness using a self-controlled case series study design: Retrospective analyses of three large healthcare databases in the US.
PLoS One. 2022 Oct 6;17(10):e0275796. doi: 10.1371/journal.pone.0275796. eCollection 2022.
7
Assessing the effectiveness of empirical calibration under different bias scenarios.
BMC Med Res Methodol. 2022 Jul 27;22(1):208. doi: 10.1186/s12874-022-01687-6.
8
Comparative effectiveness over time of the mRNA-1273 (Moderna) vaccine and the BNT162b2 (Pfizer-BioNTech) vaccine.
Nat Commun. 2022 May 2;13(1):2377. doi: 10.1038/s41467-022-30059-3.
9
Risk of aortic aneurysm and dissection following exposure to fluoroquinolones, common antibiotics, and febrile illness using a self-controlled case series study design: Retrospective analyses of three large healthcare databases in the US.
PLoS One. 2021 Aug 16;16(8):e0255887. doi: 10.1371/journal.pone.0255887. eCollection 2021.
10
The effectiveness of an oral opioid rescue medication algorithm for postoperative pain management compared to PCIA: A cohort analysis.
Anaesthesist. 2020 Sep;69(9):639-648. doi: 10.1007/s00101-020-00806-6. Epub 2020 Jul 2.

References

1
An investigation of the false discovery rate and the misinterpretation of p-values.
R Soc Open Sci. 2014 Nov 19;1(3):140216. doi: 10.1098/rsos.140216. eCollection 2014 Nov.
2
The extent and consequences of p-hacking in science.
PLoS Biol. 2015 Mar 13;13(3):e1002106. doi: 10.1371/journal.pbio.1002106. eCollection 2015 Mar.
3
Normalization of RNA-seq data using factor analysis of control genes or samples.
Nat Biotechnol. 2014 Sep;32(9):896-902. doi: 10.1038/nbt.2931. Epub 2014 Aug 24.
4
The control outcome calibration approach for causal inference with unobserved confounding.
Am J Epidemiol. 2014 Mar 1;179(5):633-40. doi: 10.1093/aje/kwt303. Epub 2013 Dec 20.
5
Sensitivity analysis for causal inference under unmeasured confounding and measurement error problems.
Int J Biostat. 2013 Nov 19;9(2):149-60. doi: 10.1515/ijb-2013-0004.
6
Discussion: An estimate of the science-wise false discovery rate and application to the top medical literature.
Biostatistics. 2014 Jan;15(1):23-7; discussion 39-45. doi: 10.1093/biostatistics/kxt035. Epub 2013 Sep 25.
7
An estimate of the science-wise false discovery rate and application to the top medical literature.
Biostatistics. 2014 Jan;15(1):1-12. doi: 10.1093/biostatistics/kxt007. Epub 2013 Sep 25.
8
Interpreting observational studies: why empirical calibration is needed to correct p-values.
Stat Med. 2014 Jan 30;33(2):209-18. doi: 10.1002/sim.5925. Epub 2013 Jul 30.
9
Using control genes to correct for unwanted variation in microarray data.
Biostatistics. 2012 Jul;13(3):539-52. doi: 10.1093/biostatistics/kxr034. Epub 2011 Nov 17.
10
Negative controls: a tool for detecting confounding and bias in observational studies.
Epidemiology. 2010 May;21(3):383-8. doi: 10.1097/EDE.0b013e3181d61eeb.