Diciolla M, Binetti G, Di Noia T, Pesce F, Schena F P, Vågane A M, Bjørneklett R, Suzuki H, Tomino Y, Naso D
Department of Electrical and Information Engineering, Polytechnic University of Bari, Bari, Italy.
Comput Biol Med. 2015 Nov 1;66:278-86. doi: 10.1016/j.compbiomed.2015.09.003. Epub 2015 Sep 25.
IgA nephropathy (IgAN) is a common kidney disease that may progress to renal failure, known as end-stage kidney disease (ESKD). One of the major difficulties in dealing with this disease is predicting a patient's long-term prognosis at the time of diagnosis. In fact, the progression of IgAN to ESKD depends on an intricate interrelationship between clinical and laboratory findings. Therefore, the objective of this work was to select the best data mining tool for building a model able to predict (I) whether a patient with biopsy-proven IgAN will reach ESKD and (II) whether a patient will reach ESKD before or after 5 years.
The largest IgAN cohort available worldwide was used to design and compare several data-driven models. The complete dataset comprised 1174 records collected from Italian, Norwegian, and Japanese IgAN patients over the last 30 years. The data mining tools considered in this work were artificial neural networks (ANNs), neuro-fuzzy systems (NFSs), support vector machines (SVMs), and decision trees (DTs). Ten-fold cross-validation was used to obtain unbiased performance estimates for all models.
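The 10-fold cross-validation step can be illustrated with a minimal sketch. It only shows how 1174 record indices might be partitioned into ten disjoint test folds; the function name `k_fold_indices`, the shuffling seed, and the absence of stratification are assumptions made for illustration, not details reported in the study.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: the study does not describe its splitting code.
    """
    if not 2 <= k <= n_records:
        raise ValueError("k must be between 2 and n_records")
    indices = list(range(n_records))
    # Shuffle once so that folds do not follow the original record order.
    random.Random(seed).shuffle(indices)
    # The first (n_records % k) folds receive one extra record.
    base, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


# 1174 is the record count given in the abstract; everything else is assumed.
folds = list(k_fold_indices(1174, k=10))
for number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"fold {number}: train={len(train_idx)} test={len(test_idx)}")
```

Each record serves as test data exactly once, so every model is always scored on records it was not trained on. With imbalanced outcome classes, a stratified split that preserves class proportions in each fold is a common alternative.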
An extensive model comparison based on accuracy, precision, recall, and F-measure is provided. Overall, the results indicate that ANNs can provide superior performance compared to the other models. The ANN for time-to-ESKD prediction achieves accuracy, precision, recall, and F-measure greater than 90%. The ANN for ESKD prediction has accuracy greater than 90%, as do precision, recall, and F-measure for the class of patients not reaching ESKD, while precision, recall, and F-measure for the class of patients reaching ESKD are slightly lower. The resulting model has been implemented in a Web-based decision support system (DSS).
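The four reported metrics can be made concrete with a short sketch that computes accuracy once and precision, recall, and F-measure per class. The helper names `accuracy` and `class_metrics` and the toy labels are invented for illustration; they are not study data and do not reproduce the reported results.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f_measure) with `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Invented toy labels (NOT study data): 1 = reaches ESKD, 0 = does not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

print("accuracy:", accuracy(y_true, y_pred))
for label in (0, 1):
    p, r, f = class_metrics(y_true, y_pred, label)
    print(f"class {label}: precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

In this toy example the overall accuracy is 0.8, yet the smaller class 1 scores 2/3 on all three per-class metrics while class 0 scores 6/7. This is why per-class figures are informative alongside accuracy when one outcome is less frequent than the other.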
The extraction of novel knowledge from clinical data and the definition of predictive models to support diagnosis, prognosis, and therapy are becoming essential tools for researchers and clinical practitioners in medicine. The proposed comparative study of several data mining models for outcome prediction in IgAN patients, using a large dataset of clinical records from three different countries, provides insight into the relative predictive ability of the considered methods when applied to this disease.