Comparison of machine learning algorithms for clinical event prediction (risk of coronary heart disease).

Author information

Machine Learning Health Working Group, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Madrid, Spain; Department of Medicine, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Madrid, Spain.

Machine Learning Health Working Group, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Madrid, Spain; Department of Computer Science and Technology, School of Architecture, Engineering and Design, Universidad Europea de Madrid, Madrid, Spain.

Publication information

J Biomed Inform. 2019 Sep;97:103257. doi: 10.1016/j.jbi.2019.103257. Epub 2019 Jul 30.

Abstract

AIM

The aim of this study is to compare the utility of several supervised machine learning (ML) algorithms for predicting clinical events in terms of their internal validity and accuracy. The results, which were obtained using two statistical software platforms, were also compared.

MATERIALS AND METHODS

The data used in this research come from the open database of the Framingham Heart Study, which originated in 1948 in Framingham, Massachusetts as a prospective study of risk factors for cardiovascular disease. Through data mining processes, three data models were developed, and a comparative methodological study of five ML algorithms (decision tree, random forest, support vector machines, neural networks, and logistic regression) was carried out. The global criterion for selecting the set of hyperparameters and the type of data manipulation was the area under the curve (AUC). The software tools used to analyze the data were R-Studio® and RapidMiner®.
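As a rough illustration of AUC-based selection, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, then picks the candidate with the highest value. It assumes that AUC here means the area under the ROC curve, which the abstract does not state explicitly. The function names `auc` and `select_best` and the toy scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 0/1 outcomes; scores: predicted risk for each observation.
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best(candidates, labels):
    """Return (name, auc) of the candidate score list with the highest AUC.

    candidates: dict mapping a hypothetical model name to its scores.
    """
    results = {name: auc(labels, s) for name, s in candidates.items()}
    best = max(results, key=results.get)
    return best, results[best]


# Toy data, invented purely for illustration.
labels = [0, 0, 1, 1]
candidates = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.1, 0.2, 0.3, 0.4],
}
best_name, best_auc = select_best(candidates, labels)
```

The pairwise loop is quadratic in the number of observations, which is fine for a sketch; a rank-based formula gives the same value more efficiently on larger data.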

RESULTS

The Framingham study open database contains 4240 observations. When the data were analyzed in R-Studio, the algorithm that yielded the greatest AUC was a neural network applied to a model that excluded all observations with at least one missing value (AUC = 0.71); when the data were analyzed in RapidMiner with the same model, the best algorithm was support vector machines (AUC = 0.75).
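The best-performing data model excluded every observation with at least one missing value, an approach often called complete-case analysis. A minimal sketch of that filtering step follows, assuming each observation is a dictionary and missing values are represented as `None`; the field names and values are invented for illustration and are not the study's variables.

```python
def complete_cases(rows):
    """Keep only observations in which no field is missing (None)."""
    return [row for row in rows if all(v is not None for v in row.values())]


# Hypothetical observations; None marks a missing value.
rows = [
    {"age": 52, "sys_bp": 130.0, "chd": 0},
    {"age": 61, "sys_bp": None, "chd": 1},
    {"age": 47, "sys_bp": 118.5, "chd": 0},
]
kept = complete_cases(rows)
dropped = len(rows) - len(kept)
```

Reporting the number of dropped observations alongside the results shows how much of the dataset the model was actually fitted on.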

CONCLUSIONS

ML algorithms can reinforce the diagnostic and prognostic capacity of traditional regression techniques. Differences between the applicability of those algorithms and the results obtained with them were a function of the software platforms used in the data analysis.
