Department of Epidemiology and Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad 917791-8564, Iran.
International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad 917791-8564, Iran.
Int J Environ Res Public Health. 2020 Sep 4;17(18):6449. doi: 10.3390/ijerph17186449.
(1) Background: Coronary angiography is considered the most reliable method for diagnosing cardiovascular disease. However, it is an invasive procedure that carries a risk of complications, so a method for determining whether angiography is necessary would be valuable. The objective of this study was to compare support vector machine, naïve Bayes and logistic regression models to determine the diagnostic factors that can predict the need for coronary angiography. These models are machine learning algorithms. Machine learning is a branch of artificial intelligence that aims to design and develop algorithms that allow computers to improve their performance in data analysis and decision making. The process involves analyzing past experience to find practical and helpful regularities and patterns that a human might overlook.
(2) Materials and Methods: This cross-sectional study was performed on 1187 candidates for angiography referred to Ghaem Hospital, Mashhad, Iran, from 2011 to 2012. Logistic regression, naïve Bayes and support vector machine models were applied to determine whether they could predict the results of angiography. The sensitivity, specificity, positive and negative predictive values, AUC (area under the curve) and accuracy of all three models were then computed in order to compare them. All analyses were performed using R 3.4.3 software (R Core Team; Auckland, New Zealand) with the help of other software packages including receiver operating characteristic (ROC), caret, e1071 and rminer.
(3) Results: The areas under the curve for logistic regression, naïve Bayes and support vector machine were similar: 0.76, 0.74 and 0.75, respectively. Thus, in terms of model parsimony and simplicity of application, the naïve Bayes model with three variables had the best performance in comparison with the logistic regression model with seven variables and the support vector machine with six variables.
(4) Conclusions: Gender, age and fasting blood glucose (FBG) were found to be the most important factors for predicting the result of coronary angiography. The naïve Bayes model performed well using these three variables alone, and they are considered important variables for the other two models as well. Given their acceptable predictive performance, the models can be used as pragmatic, cost-effective and valuable methods that support physicians in decision making.
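The comparison metrics named in the Materials and Methods (sensitivity, specificity, positive and negative predictive values, accuracy and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written in R; the function names `diagnostic_metrics` and `auc`, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and no study data are reproduced.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels (1 = positive angiography).

    Assumes both classes occur in y_true and in y_pred, so that no
    denominator is zero (hypothetical helper, not from the paper).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up labels and model scores, not study data).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(diagnostic_metrics(toy_true, toy_pred))  # every metric is 0.75 here
print(auc(toy_true, toy_scores))               # 0.8125
```

Unlike the other five metrics, AUC is computed from the model's continuous scores and does not depend on the chosen threshold, which is one reason it is convenient for comparing models of different types.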