Editors can lead researchers to confidence intervals, but can't make them think: statistical reform lessons from medicine.

Authors

Fidler Fiona, Thomason Neil, Cumming Geoff, Finch Sue, Leeman Joanna

Affiliation

La Trobe University, Melbourne, Australia.

Publication information

Psychol Sci. 2004 Feb;15(2):119-26. doi: 10.1111/j.0963-7214.2004.01502008.x.

DOI:10.1111/j.0963-7214.2004.01502008.x

PMID:14738519

Abstract

Since the mid-1980s, confidence intervals (CIs) have been standard in medical journals. We sought lessons for psychology from medicine's experience with statistical reform by investigating two attempts by Kenneth Rothman to change statistical practices. We examined 594 American Journal of Public Health (AJPH) articles published between 1982 and 2000 and 110 Epidemiology articles published in 1990 and 2000. Rothman's editorial instruction to report CIs and not p values was largely effective: In AJPH, sole reliance on p values dropped from 63% to 5%, and CI reporting rose from 10% to 54%; Epidemiology showed even stronger compliance. However, compliance was superficial: Very few authors referred to CIs when discussing results. The results of our survey support what other research has indicated: Editorial policy alone is not a sufficient mechanism for statistical reform. Achieving substantial, desirable change will require further guidance regarding use and interpretation of CIs and appropriate effect size measures. Necessary steps will include studying researchers' understanding of CIs, improving education, and developing empirically justified recommendations for improved statistical practice.

Similar articles

Editors can lead researchers to confidence intervals, but can't make them think: statistical reform lessons from medicine.

Psychol Sci. 2004 Feb;15(2):119-26. doi: 10.1111/j.0963-7214.2004.01502008.x.

Confidence intervals for effect sizes: compliance and clinical significance in the Journal of Consulting and Clinical Psychology.

J Consult Clin Psychol. 2010 Jun;78(3):287-97. doi: 10.1037/a0019294.

Toward improved statistical reporting in the journal of consulting and clinical psychology.

J Consult Clin Psychol. 2005 Feb;73(1):136-43. doi: 10.1037/0022-006X.73.1.136.

An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature.

South Med J. 2015 Oct;108(10):629-34. doi: 10.14423/SMJ.0000000000000354.

[Statistical and epidemiological methods used in biomedical research: implications for initial medical education].

Rev Epidemiol Sante Publique. 2013 Jun;61(3):261-8. doi: 10.1016/j.respe.2012.11.002. Epub 2013 Apr 30.

Can't Get No Reproduction: Leading Researchers Discuss the Problem of Irreproducible Results.

Circ Res. 2015 Sep 25;117(8):667-70. doi: 10.1161/CIRCRESAHA.115.307532.

How often do leading biomedical journals use statistical experts to evaluate statistical methods? The results of a survey.

PLoS One. 2020 Oct 1;15(10):e0239598. doi: 10.1371/journal.pone.0239598. eCollection 2020.

Common statistical and research design problems in manuscripts submitted to high-impact psychiatry journals: what editors and reviewers want authors to know.

J Psychiatr Res. 2009 Oct;43(15):1231-4. doi: 10.1016/j.jpsychires.2009.04.007. Epub 2009 May 10.

Advertising in dermatology journals: journals' and journal editors' policies, practices, and attitudes.

J Am Acad Dermatol. 2006 Jul;55(1):116-22. doi: 10.1016/j.jaad.2006.01.046.

Methods of reporting statistical results from medical research studies.

Am J Epidemiol. 1995 May 15;141(10):896-906. doi: 10.1093/oxfordjournals.aje.a117356.

Cited by

Alternative to the statistical mass confusion of testing for "no effect".

J Cell Biol. 2025 Aug 4;224(8). doi: 10.1083/jcb.202403034. Epub 2025 Jul 23.

Transdiagnostic Symptom Domains Have Distinct Patterns of Association With Head Motion During Multimodal Imaging in Children.

Biol Psychiatry Glob Open Sci. 2025 Apr 17;5(4):100506. doi: 10.1016/j.bpsgos.2025.100506. eCollection 2025 Jul.

Alternative to the statistical mass confusion of testing for "no effect".

ArXiv. 2025 May 12:arXiv:2407.07114v3.

Linear regression reporting practices for health researchers, a cross-sectional meta-research study.

PLoS One. 2025 Mar 20;20(3):e0305150. doi: 10.1371/journal.pone.0305150. eCollection 2025.

Reporting of confidence intervals, achievement of intended sample size, and adjustment for multiple primary outcomes in randomised trials of physical therapy interventions: an analysis of 100 representatively sampled trials.

Braz J Phys Ther. 2024 May-Jun;28(3):101079. doi: 10.1016/j.bjpt.2024.101079. Epub 2024 May 21.

Impact of redefining statistical significance on P-hacking and false positive rates: An agent-based model.

PLoS One. 2024 May 16;19(5):e0303262. doi: 10.1371/journal.pone.0303262. eCollection 2024.

Encouraging responsible reporting practices in the Instructions to Authors of neuroscience and physiology journals: There is room to improve.

PLoS One. 2023 Mar 30;18(3):e0283753. doi: 10.1371/journal.pone.0283753. eCollection 2023.

Quality Output Checklist and Content Assessment (QuOCCA): a new tool for assessing research quality and reproducibility.

BMJ Open. 2022 Sep 26;12(9):e060976. doi: 10.1136/bmjopen-2022-060976.

Statistical inference through estimation: recommendations from the International Society of Physiotherapy Journal Editors.

J Man Manip Ther. 2022 Jun;30(3):133-138. doi: 10.1080/10669817.2022.2071980.

Reporting of Statistical Inference in Abstracts of Major Cancer Journals, 1990 to 2020.

JAMA Netw Open. 2022 Jun 1;5(6):e2218337. doi: 10.1001/jamanetworkopen.2022.18337.
