Identify the underlying true model from other models for clinical practice using model performance measures.

Author information

Li Yan

Affiliation

School of Mathematical Sciences, Xiamen University, Xiamen, 361005, People's Republic of China.

Publication information

BMC Med Res Methodol. 2025 Jan 9;25(1):4. doi: 10.1186/s12874-025-02457-w.

DOI:10.1186/s12874-025-02457-w

PMID:39789439

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11715858/

Abstract

OBJECTIVE

To assess whether the true outcome-generating model can be distinguished from other candidate models for clinical practice using current conventional model performance measures, across a range of simulation scenarios and with cardiovascular disease (CVD) risk prediction as an exemplar.

STUDY DESIGN AND SETTING

Thousands of true-model scenarios were used to simulate clinical data. Candidate models and the true models were trained on training datasets and then compared on testing datasets using 25 conventionally used model performance measures. The study comprised a univariate simulation (179.2k simulated datasets and over 1.792 million models), a multivariate simulation (728k simulated datasets and over 8.736 million models), and a CVD risk prediction case analysis.
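The comparison described above can be illustrated with a minimal sketch of a single univariate scenario. Everything in it is an assumption made for illustration and is not taken from the study: the coefficients, sample sizes, the gradient-ascent fitting routine, and the helper names (`simulate`, `fit_logistic`, `c_statistic`) are hypothetical, and only one of the 25 measures (the C statistic) is computed.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def simulate(n, rng, intercept=-1.0, slope=1.0):
    """Draw (x, z, y) rows: x drives the outcome, z is pure noise (assumed values)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)  # unrelated to the outcome by construction
        y = 1 if rng.random() < sigmoid(intercept + slope * x) else 0
        rows.append((x, z, y))
    return rows


def fit_logistic(features, labels, steps=200, lr=0.5):
    """Logistic regression by plain gradient ascent; returns [b0, b1, ...]."""
    k = len(features[0])
    beta = [0.0] * (k + 1)
    n = len(labels)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for row, y in zip(features, labels):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
            err = y - p
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, features):
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
            for row in features]


def c_statistic(scores, labels):
    """Share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train, test = simulate(1000, rng), simulate(1000, rng)
y_train = [y for _, _, y in train]
y_test = [y for _, _, y in test]

# "True" model: uses only the causal predictor x.
true_beta = fit_logistic([(x,) for x, _, _ in train], y_train)
# Candidate model with extra noise: uses x plus the irrelevant z.
noisy_beta = fit_logistic([(x, z) for x, z, _ in train], y_train)

c_true = c_statistic(predict(true_beta, [(x,) for x, _, _ in test]), y_test)
c_noisy = c_statistic(predict(noisy_beta, [(x, z) for x, z, _ in test]), y_test)
# Flip-coin model: scores are random and ignore the data entirely.
c_coin = c_statistic([rng.random() for _ in test], y_test)

print(f"true={c_true:.3f} noisy={c_noisy:.3f} coin={c_coin:.3f}")
```

In a sketch like this, the true model and the noisy candidate tend to have nearly indistinguishable C statistics on the test data, while the flip-coin model sits near 0.5. The study repeats this kind of comparison over many scenarios and measures.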

RESULTS

True models had an overall C statistic (95% range) of 0.67 (0.51, 0.96) across all scenarios in the univariate simulation, 0.81 (0.54, 0.98) in the multivariate simulation, 0.85 (0.82, 0.88) in the univariate case analysis, and 0.85 (0.82, 0.88) in the multivariate case analysis. The measures showed very clear differences between the true model and the flip-coin model, little or no difference between the true model and candidate models with extra noise, and relatively small differences between the true model and proxy models missing causal predictors.

CONCLUSION

The study found that current conventional measures for binary outcomes do not always identify the true model as the best-performing model, even when the true model is present in the clinical data. New statistical approaches or measures should be established to distinguish the causal true model from proxy models, especially proxy models with extra noise and/or missing causal predictors.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fea7/11715858/3ffe1c7dd921/12874_2025_2457_Fig1_HTML.jpg
