Department of Mathematics and Computer Science, Physical Sciences and Earth Sciences, University of Messina, Messina, Italy.
Institute of Clinical Physiology - Reggio Calabria Unit, Laboratory of Bioinformatics, National Research Council, Italy.
Comput Methods Programs Biomed. 2019 Aug;177:9-15. doi: 10.1016/j.cmpb.2019.05.005. Epub 2019 May 13.
Patients with End-Stage Kidney Disease (ESKD) have a unique cardiovascular risk. This study aims to predict, with a certain degree of precision, death and cardiovascular disease in dialysis patients.
To this end, we used machine learning techniques. We considered two datasets: the first is an Italian dataset obtained from the Istituto di Fisiologia Clinica of the Consiglio Nazionale delle Ricerche in Reggio Calabria; the second is an American dataset provided by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) repository. From each we derived 5 datasets, according to the outcome of interest. We tested several types of algorithm (both linear and non-linear) and ultimately chose the Support Vector Machine. In particular, the best performance was obtained with a non-linear SVC with an RBF kernel, optimized with GridSearch. GridSearch is a procedure that searches for the best combination of hyper-parameters (in our case, the best pair (C, γ)) in order to improve the accuracy of the algorithm.
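The tuning step described above can be illustrated with a minimal sketch. It assumes scikit-learn (the abstract names SVC and GridSearch but does not state the library), uses synthetic data in place of the Italian and NIDDK cohorts, and searches a hypothetical log-spaced grid over (C, γ); the grid values, the feature scaling step, and the 5-fold cross-validation are illustrative assumptions, not the study's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): NOT the Italian or NIDDK cohorts.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a non-linear SVC with an RBF kernel.
# Scaling is an assumption here; the abstract does not describe preprocessing.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical grid over (C, gamma); the study's actual grid is not given.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}

# Exhaustively evaluate every (C, gamma) pair with 5-fold cross-validation
# and refit the best pair on the full training split.
search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_C = search.best_params_["svc__C"]
best_gamma = search.best_params_["svc__gamma"]
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out data
print(f"best (C, gamma) = ({best_C}, {best_gamma}), "
      f"held-out accuracy = {test_accuracy:.3f}")
```

Here C controls the penalty for misclassified training points and γ controls the width of the RBF kernel; the grid search simply scores each pair by cross-validation and keeps the best one. Accuracy on synthetic data says nothing about the figures reported in the study.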
The non-linear SVC with RBF kernel, optimized with GridSearch, achieved an accuracy of 95.25% on the Italian dataset and 92.15% on the American dataset in predicting Ischaemic Heart Disease over a timeframe of 2.5 years. Performance was worse for the other outcomes.
The machine learning-based approach applied in our study can predict, with high accuracy, the onset of cardiovascular diseases in patients on dialysis.