Assessment of Adherence to Reporting Guidelines by Commonly Used Clinical Prediction Models From a Single Vendor: A Systematic Review.

Affiliations

Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California.

Department of Pediatrics, Stanford University School of Medicine, Stanford, California.

Publication information

JAMA Netw Open. 2022 Aug 1;5(8):e2227779. doi: 10.1001/jamanetworkopen.2022.27779.

DOI: 10.1001/jamanetworkopen.2022.27779
PMID: 35984654
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9391954/

Abstract

IMPORTANCE

Various model reporting guidelines have been proposed to ensure clinical prediction models are reliable and fair. However, no consensus exists about which model details are essential to report, and commonalities and differences among reporting guidelines have not been characterized. Furthermore, how well documentation of deployed models adheres to these guidelines has not been studied.

OBJECTIVES

To assess information requested by model reporting guidelines and whether the documentation for commonly used machine learning models developed by a single vendor provides the information requested.

EVIDENCE REVIEW

MEDLINE was queried using the search terms "machine learning model card" and "reporting machine learning" from November 4 to December 6, 2020. References were reviewed to find additional publications, and publications without specific reporting recommendations were excluded. Similar elements requested for reporting were merged into representative items. Four independent reviewers and 1 adjudicator assessed how often documentation for the most commonly used models developed by a single vendor reported the items.
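The tallying step described above (merging requested elements into representative items, then counting how many guidelines request each one) can be illustrated with a minimal Python sketch. The guideline names, item labels, and helper functions below are hypothetical placeholders, not data or tooling from the study; the sketch assumes the merging into representative items has already been done by hand.

```python
from collections import Counter

# Hypothetical toy data (NOT from the study): each guideline maps to the set of
# representative items it requests, after similar elements have been merged.
guideline_items = {
    "guideline_A": {"outcome definition", "internal validation", "missing data strategy"},
    "guideline_B": {"outcome definition", "internal validation"},
    "guideline_C": {"outcome definition", "subgroup analysis"},
}


def count_requests(guidelines):
    """Map each item to the number of guidelines that request it."""
    counts = Counter()
    for items in guidelines.values():
        # items is a set, so each guideline counts at most once per item
        counts.update(items)
    return counts


def requested_by_at_least(counts, threshold):
    """Items requested by `threshold` or more guidelines, sorted by name."""
    return sorted(item for item, n in counts.items() if n >= threshold)


def requested_by_exactly(counts, n_guidelines):
    """Items requested by exactly `n_guidelines` guidelines, sorted by name."""
    return sorted(item for item, n in counts.items() if n == n_guidelines)


counts = count_requests(guideline_items)
common_items = requested_by_at_least(counts, 2)
single_guideline_items = requested_by_exactly(counts, 1)
```

With real inputs, the same counts would separate commonly requested items (the abstract uses a threshold of 10 or more guidelines) from items requested by only 1 guideline.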

FINDINGS

From 15 model reporting guidelines, 220 unique items were identified that represented the collective reporting requirements. Although 12 items were commonly requested (requested by 10 or more guidelines), 77 items were requested by just 1 guideline. Documentation for 12 commonly used models from a single vendor reported a median of 39% (IQR, 37%-43%; range, 31%-47%) of items from the collective reporting requirements. Many of the commonly requested items had 100% reporting rates, including items concerning outcome definition, area under the receiver operating characteristics curve, internal validation, and intended clinical use. Several items reported half the time or less related to reliability, such as external validation, uncertainty measures, and strategy for handling missing data. Other frequently unreported items related to fairness (summary statistics and subgroup analyses, including for race and ethnicity or sex).
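As a rough illustration of how a summary such as "median 39% (IQR, 37%-43%; range, 31%-47%)" could be derived, the following Python sketch computes a per-model reporting rate and then summarizes the rates across models. The model names and reported/unreported flags are invented toy values, and the quartile method (`method="inclusive"`) is an assumption; the abstract does not state how the quartiles were calculated.

```python
from statistics import median, quantiles

# Hypothetical toy assessments (NOT study data): for each model, whether its
# documentation reported each representative item.
assessments = {
    "model_1": {"outcome definition": True, "internal validation": False,
                "external validation": False, "missing data strategy": False},
    "model_2": {"outcome definition": True, "internal validation": True,
                "external validation": False, "missing data strategy": False},
    "model_3": {"outcome definition": True, "internal validation": True,
                "external validation": True, "missing data strategy": False},
}


def reporting_rate(item_flags):
    """Fraction of items that the model documentation reported."""
    return sum(item_flags.values()) / len(item_flags)


def summarize(rates):
    """Median, quartiles, and range of per-model reporting rates."""
    # Assumption: inclusive quartiles; other methods give slightly different values.
    q1, _, q3 = quantiles(rates, n=4, method="inclusive")
    return {
        "median": median(rates),
        "q1": q1,
        "q3": q3,
        "min": min(rates),
        "max": max(rates),
    }


rates = [reporting_rate(flags) for flags in assessments.values()]
summary = summarize(rates)
```

The per-item view (how many of the models report a given item) follows from the same table by summing over models instead of over items, which is how an item could be identified as having a 100% reporting rate or as being reported half the time or less.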

CONCLUSIONS AND RELEVANCE

These findings suggest that consistent reporting recommendations for clinical predictive models are needed for model developers to share necessary information for model deployment. The many published guidelines would, collectively, require reporting more than 200 items. Model documentation from 1 vendor reported the most commonly requested items from model reporting guidelines. However, areas for improvement were identified in reporting items related to model reliability and fairness. This analysis led to feedback to the vendor, which motivated updates to the documentation for future users.
