Improving the performance of machine learning algorithms for health outcomes predictions in multicentric cohorts.

Affiliations

School of Public Health, University of São Paulo, São Paulo, SP, Brazil.

Brazilian Institute of Education, Development and Research-IDP, Economics Graduate Program, Brasilia, DF, Brazil.

Publication information

Sci Rep. 2023 Jan 19;13(1):1022. doi: 10.1038/s41598-022-26467-6.

DOI:10.1038/s41598-022-26467-6
PMID:36658181
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9849836/
Abstract

Machine learning algorithms are being increasingly used in healthcare settings but their generalizability between different regions is still unknown. This study aims to identify the strategy that maximizes the predictive performance of identifying the risk of death by COVID-19 in different regions of a large and unequal country. This is a multicenter cohort study with data collected from patients with a positive RT-PCR test for COVID-19 from March to August 2020 (n = 8477) in 18 hospitals, covering all five Brazilian regions. Of all patients with a positive RT-PCR test during the period, 2356 (28%) died. Eight different strategies were used for training and evaluating the performance of three popular machine learning algorithms (extreme gradient boosting, lightGBM, and catboost). The strategies ranged from only using training data from a single hospital, up to aggregating patients by their geographic regions. The predictive performance of the algorithms was evaluated by the area under the ROC curve (AUROC) on the test set of each hospital. We found that the best overall predictive performances were obtained when using training data from the same hospital, which was the winning strategy for 11 (61%) of the 18 participating hospitals. In this study, the use of more patient data from other regions slightly decreased predictive performance. However, models trained in other hospitals still had acceptable performances and could be a solution while data for a specific hospital is being collected.
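The evaluation described in the abstract (score each training strategy by AUROC on a hospital's test set, then pick the best strategy for that hospital) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auroc` and `best_strategy`, the strategy labels, and all labels and scores below are hypothetical toy values chosen for illustration, and the AUROC is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular library.

```python
from typing import Mapping, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case (label 1)
    gets a higher score than a randomly chosen negative case (label 0),
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def best_strategy(auroc_by_strategy: Mapping[str, float]) -> str:
    """Return the name of the strategy with the highest test-set AUROC."""
    return max(auroc_by_strategy, key=lambda name: auroc_by_strategy[name])


# Hypothetical test set for one hospital (1 = died, 0 = survived).
y_test = [1, 0, 1, 0, 0, 1]

# Hypothetical predicted risks from two models evaluated on that same test set:
# one trained only on the hospital's own data, one on data pooled across regions.
local_scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6]
pooled_scores = [0.8, 0.3, 0.4, 0.5, 0.2, 0.6]

results = {
    "same_hospital": auroc(y_test, local_scores),
    "pooled_regions": auroc(y_test, pooled_scores),
}
winner = best_strategy(results)
print(results, winner)
```

Repeating this comparison for each of the 18 hospitals and counting how often each strategy wins would give a tally of the kind the abstract reports (same-hospital training winning in 11 of 18 hospitals). The pairwise loop is quadratic in the test-set size, which is fine for a sketch but would normally be replaced by a rank-based implementation on larger cohorts.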

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e972/9852456/3324001f6550/41598_2022_26467_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e972/9852456/44524349d38d/41598_2022_26467_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e972/9852456/738d8bf10514/41598_2022_26467_Fig3_HTML.jpg
