Ali Sikandar, Hussain Ali, Aich Satyabrata, Park Moo Suk, Chung Man Pyo, Jeong Sung Hwan, Song Jin Woo, Lee Jae Ha, Kim Hee Cheol
Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae 50834, Korea.
Department of Computer Engineering, Institute of Digital Anti-Aging Healthcare, Inje University, Gimhae 50834, Korea.
Life (Basel). 2021 Oct 15;11(10):1092. doi: 10.3390/life11101092.
Idiopathic pulmonary fibrosis (IPF) is a rare but fatal lung disease. It is progressive, and assessing its severity is slow and tedious. Machine learning techniques have proven effective at detecting many lung diseases. In this paper, we propose a model that detects the severity of IPF at an early stage so that potentially fatal situations can be controlled. To develop the model, we used the IPF dataset of the Korean interstitial lung disease cohort. First, we preprocessed the data using several preprocessing techniques and selected 26 highly relevant features from a total of 502 features for 2424 subjects. Second, we split the data into 80% training and 20% testing sets and applied oversampling to the training set. Third, we trained three state-of-the-art machine learning models and combined their results into a new soft voting ensemble-based model for predicting the severity of IPF in patients; hyperparameter tuning was performed to obtain optimal model performance. Fourth, we evaluated the proposed model by calculating accuracy, AUC, the confusion matrix, precision, recall, and F1-score. The proposed soft voting ensemble-based model achieved an accuracy of 0.7100, precision of 0.6400, recall of 0.7100, and F1-score of 0.6600. The model can help doctors and IPF patients assess the severity of IPF in its early stages and take proactive measures, and it can support doctors in making the necessary treatment decisions.
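The oversampling and soft voting steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the three models, the oversampling method, or the number of severity classes, so the sketch assumes plain random oversampling, three hypothetical classes, and made-up probability outputs. The names `random_oversample`, `soft_vote`, `model_a`, `model_b`, and `model_c` are illustrative only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class (assumed method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def soft_vote(prob_sets):
    """Average the per-class probabilities across models, then
    return the index of the highest averaged probability per sample."""
    n_models = len(prob_sets)
    preds = []
    for per_sample in zip(*prob_sets):  # one probability vector per model
        avg = [sum(p) / n_models for p in zip(*per_sample)]
        preds.append(max(range(len(avg)), key=avg.__getitem__))
    return preds


# Oversampling is applied to the training split only, never to the test split.
train_rows = [[0.1], [0.2], [0.3], [0.9]]   # hypothetical feature rows
train_labels = [0, 0, 0, 1]                 # imbalanced hypothetical labels
bal_rows, bal_labels = random_oversample(train_rows, train_labels)

# Hypothetical class probabilities from three already-trained models
# for two test samples over three assumed severity classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.4, 0.4, 0.2], [0.1, 0.3, 0.6]]
model_c = [[0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]
predictions = soft_vote([model_a, model_b, model_c])
```

In the second sample, one model favours class 1 but the averaged probabilities favour class 2. This shows how soft voting weighs each model's confidence rather than counting only its top choice.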