Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
Neurosurgical Artificial Intelligence Laboratory Aachen (NAILA), Department of Neurosurgery, RWTH Aachen University Hospital, Aachen, Germany.
Acta Neurochir Suppl. 2022;134:33-41. doi: 10.1007/978-3-030-85292-4_5.
We illustrate the steps required to train and validate a simple, machine learning-based clinical prediction model for any binary outcome, such as the occurrence of a complication, in the statistical programming language R. To illustrate the methods applied, we supply a simulated database of 10,000 glioblastoma patients who underwent microsurgery, and predict 12-month survival. We walk the reader through each step, including import, checking, and splitting of datasets. In terms of pre-processing, we focus on how to practically implement imputation using a k-nearest neighbor algorithm, and how to perform feature selection using recursive feature elimination. When it comes to training models, we apply the theory discussed in Parts I-III. We show how to implement bootstrapping and how to evaluate and select models based on out-of-sample error. Specifically for classification, we discuss how to counteract class imbalance by using upsampling techniques. We discuss why reporting, at a minimum, accuracy, area under the curve (AUC), sensitivity, and specificity for discrimination, as well as slope and intercept for calibration (if possible alongside a calibration plot), is paramount. Finally, we explain how to arrive at a measure of variable importance using a universal, AUC-based method. We provide the full, structured code, as well as the complete glioblastoma survival database, for the readers to download and execute in parallel to this section.
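The workflow the abstract outlines (splitting, upsampling to counteract class imbalance, bootstrap resampling, and AUC-based model selection) can be sketched in R with the caret package. This is a minimal illustration only, not the chapter's actual companion code: the simulated data frame, its column names (e.g. `survival_12m`), and all parameter choices are assumptions made for the example.

```r
library(caret)

set.seed(42)

# Small simulated stand-in for the glioblastoma database
df <- data.frame(
  age = rnorm(500, 60, 10),
  kps = sample(c(60, 70, 80, 90, 100), 500, replace = TRUE),
  survival_12m = factor(sample(c("no", "yes"), 500, replace = TRUE,
                               prob = c(0.7, 0.3)))
)

# Stratified train/test split (80/20)
idx <- createDataPartition(df$survival_12m, p = 0.8, list = FALSE)
train_set <- df[idx, ]
test_set <- df[-idx, ]

# Bootstrap resampling with upsampling of the minority class;
# classProbs + twoClassSummary enable AUC ("ROC") as the selection metric.
# (k-nearest neighbor imputation, when needed, can be requested via
# preProcess = "knnImpute" in train().)
ctrl <- trainControl(method = "boot", number = 25,
                     sampling = "up",
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

fit <- train(survival_12m ~ ., data = train_set,
             method = "glm", family = "binomial",
             metric = "ROC", trControl = ctrl)

# Out-of-sample discrimination on the held-out split
pred <- predict(fit, newdata = test_set)
cm <- confusionMatrix(pred, test_set$survival_12m)
print(cm$overall["Accuracy"])
```

In practice one would report, alongside accuracy, the AUC, sensitivity, and specificity, plus calibration slope and intercept, as the abstract emphasizes; caret's resampling results (`fit$results`) expose the bootstrap ROC estimate directly.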