Qu Kexin, Gainey Monique, Kanekar Samika S, Nasrim Sabiha, Nelson Eric J, Garbern Stephanie C, Monjory Mahmuda, Alam Nur H, Levine Adam C, Schmid Christopher H
Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America.
Department of Emergency Medicine, Rhode Island Hospital, Providence, Rhode Island, United States of America.
PLOS Digit Health. 2025 May 6;4(5):e0000820. doi: 10.1371/journal.pdig.0000820. eCollection 2025 May.
Many comparisons of statistical regression and machine learning algorithms for building clinical predictive models use inadequate methods to build the regression models and lack proper independent test sets on which to externally validate the models. Proper comparisons for models of ordinal categorical outcomes do not exist. We set out to compare model discrimination for four regression and machine learning methods in a case study predicting the ordinal outcome of severe, some, or no dehydration among patients with acute diarrhea presenting to a large medical center in Bangladesh, using data from the NIRUDAK study derivation and validation cohorts. Proportional Odds Logistic Regression (POLR), penalized ordinal regression (RIDGE), classification tree (CART), and random forest (RF) models were built to predict dehydration severity and compared using three ordinal discrimination indices: ordinal c-index (ORC), generalized c-index (GC), and average dichotomous c-index (ADC). Performance was evaluated on models developed on the training data, on the same models applied to an external test set, and through internal validation with three bootstrap algorithms to correct for overoptimism. RF had superior discrimination on the original training data set, but its performance was more similar to that of the other three methods after internal validation using the bootstrap. Performance for all models was lower on the prospective test dataset, with particularly large reductions for RF and RIDGE. POLR had the best performance in the test dataset and was also the most efficient, with the smallest final model size. Clinical prediction models for ordinal outcomes, just like those for binary and continuous outcomes, need to be prospectively validated on external test sets when possible, because internal validation may give an overly optimistic picture of model performance.
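To illustrate the kind of discrimination measure and internal validation the abstract describes, the sketch below is a minimal, hypothetical implementation (not the study's actual code): it assumes the ORC is the average of pairwise c-indices over all pairs of outcome categories, and it uses a standard Harrell-style bootstrap optimism correction with a one-feature least-squares score standing in for the fitted models.

```python
import numpy as np
from itertools import combinations

def pairwise_c(score, y, a, b):
    """C-index restricted to subjects in outcome categories a < b: the
    probability that a subject in the higher category b receives a higher
    risk score than one in the lower category a (ties count one half)."""
    sa, sb = score[y == a], score[y == b]
    diff = sb[:, None] - sa[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def ordinal_c_index(score, y):
    """ORC (assumed definition): average pairwise c-index over all
    ordered pairs of outcome categories."""
    cats = np.unique(y)
    return np.mean([pairwise_c(score, y, a, b) for a, b in combinations(cats, 2)])

def bootstrap_optimism_corrected_orc(x, y, n_boot=200, seed=0):
    """Harrell-style optimism correction: refit the scoring rule on each
    bootstrap resample, compute its ORC on the resample (apparent) and on
    the original data (test), and subtract the mean optimism from the
    full-data apparent ORC. The 'model' here is a hypothetical one-feature
    linear score fit by least squares, not any model from the study."""
    rng = np.random.default_rng(seed)
    fit = lambda xs, ys: np.polyfit(xs, ys, 1)  # returns (slope, intercept)
    apparent_full = ordinal_c_index(np.polyval(fit(x, y), x), y)
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))   # resample with replacement
        coef = fit(x[idx], y[idx])
        apparent = ordinal_c_index(np.polyval(coef, x[idx]), y[idx])
        test = ordinal_c_index(np.polyval(coef, x), y)
        optimism.append(apparent - test)
    return apparent_full - np.mean(optimism)
```

A perfectly ordered score yields an ORC of 1.0, a perfectly reversed score 0.0, and the optimism-corrected estimate is typically somewhat below the apparent (training-data) value, mirroring the shrinkage the study observed after internal validation.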
Regression methods can perform as well as more automated machine learning methods if constructed with attention to potential nonlinear associations. Because regression models are often more interpretable clinically, their use should be encouraged.
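The point about attention to nonlinear associations can be sketched with synthetic data (these variables are illustrative, not the NIRUDAK predictors): a regression restricted to a linear term misses a U-shaped relationship entirely, while simply adding a squared term recovers it, without resorting to a more automated machine learning method.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
y = x**2 + rng.normal(0, 0.3, 200)   # true association is U-shaped

def lstsq_r2(design, y):
    """Fit ordinary least squares and return in-sample R^2."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid.var() / y.var()

linear = np.column_stack([np.ones_like(x), x])            # intercept + x only
expanded = np.column_stack([np.ones_like(x), x, x**2])    # adds a squared term

r2_linear = lstsq_r2(linear, y)      # near zero: slope averages out the U shape
r2_expanded = lstsq_r2(expanded, y)  # high: the quadratic term captures it
```

The same idea carries over to ordinal regression models such as POLR, where polynomial or spline terms for continuous predictors serve the role of the squared term here.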