Lee Hsiu-An, Chao Louis R, Hsu Chien-Yeh
Department of Computer Science and Information Engineering, Tamkang University, New Taipei 251, Taiwan.
National Health Research Institutes, Zhunan 350, Taiwan.
Cancers (Basel). 2021 Feb 23;13(4):928. doi: 10.3390/cancers13040928.
Cancer is the leading cause of death in Taiwan. According to the Cancer Registration Report of Taiwan's Ministry of Health and Welfare, a total of 13,488 people suffered from lung cancer in 2016, making it the second-most common cancer overall and the leading cancer in men. Compared with other types of cancer, the incidence of lung cancer is high. In this study, the National Health Insurance Research Database (NHIRDB) was used to determine the diseases and symptoms associated with lung cancer, and a deep neural network model for predicting the 10-year probability of lung cancer was developed. The proposed model could allow patients at high risk of lung cancer to receive an earlier diagnosis and could support physicians' clinical decision-making. The study was designed as a cohort study. The subjects were patients diagnosed with lung cancer between 2000 and 2009, and each patient's disease history was traced back up to ten years before the lung cancer diagnosis. As a result, a total of 13 diseases were selected as predictive factors. A nine-layer deep neural network model was created to predict the probability of lung cancer from the different pre-diagnosed diseases and to support earlier detection of lung cancer in potential patients. The model was trained 1000 times with a batch size of 100, using the stochastic gradient descent (SGD) optimizer with a learning rate of 0.1 and a momentum of 0.1. The proposed model showed an accuracy of 85.4%, a sensitivity of 72.4%, and a specificity of 85%, with an area under the ROC curve (AUROC) of 87.4% (95% CI, 0.8604-0.8885). Based on data analysis and deep learning, our prediction model discovered some features that had not been previously identified by clinical knowledge.
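The training settings stated above (batch size 100, learning rate 0.1, momentum 0.1) can be illustrated with a minimal sketch of mini-batch SGD with momentum. This is not the authors' implementation. The one-parameter linear model, the toy data, the `EPOCHS` value, and the function names `sgd_momentum_step` and `fit_slope` are all assumptions made for illustration, and the classical momentum update shown here is only one common formulation.

```python
import random

LEARNING_RATE = 0.1  # stated in the abstract
MOMENTUM = 0.1       # stated in the abstract
BATCH_SIZE = 100     # stated in the abstract
EPOCHS = 20          # illustrative assumption, not the study's setting


def sgd_momentum_step(weight, velocity, grad, lr=LEARNING_RATE, mu=MOMENTUM):
    """One classical-momentum update: v <- mu*v - lr*grad, then w <- w + v."""
    velocity = mu * velocity - lr * grad
    return weight + velocity, velocity


def fit_slope(xs, ys, seed=0):
    """Fit y ~ w*x by mini-batch SGD with momentum on mean squared error."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w, v = 0.0, 0.0
    for _ in range(EPOCHS):
        rng.shuffle(indices)  # new batch composition every epoch
        for start in range(0, len(indices), BATCH_SIZE):
            batch = indices[start:start + BATCH_SIZE]
            # Gradient of mean((w*x - y)^2) with respect to w over the batch.
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w, v = sgd_momentum_step(w, v, grad)
    return w


# Hypothetical toy data (not from the study): y = 3x with x drawn from [0, 1).
data_rng = random.Random(42)
xs = [data_rng.random() for _ in range(1000)]
ys = [3.0 * x for x in xs]
w_hat = fit_slope(xs, ys)
print(f"estimated slope: {w_hat:.4f}")
```

In the study, the same kind of update would be applied to every weight of the nine-layer network rather than to a single slope. With a momentum as low as 0.1, each step is dominated by the current mini-batch gradient.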
This study traces a decade of clinical diagnostic records to identify possible symptoms and comorbidities of lung cancer, enabling earlier prediction of the disease and helping more patients obtain an early diagnosis.
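The accuracy, sensitivity, and specificity reported in the abstract are standard quantities derived from a binary confusion matrix. The sketch below shows how they are computed. The function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn count true cases predicted positive/negative;
    tn/fp count non-cases predicted negative/positive.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Sensitivity: share of true cases that the model flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-cases that the model correctly clears.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (NOT the study's data).
acc, sens, spec = classification_metrics(tp=80, fp=30, tn=170, fn=20)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so its value depends on the proportion of cases in the evaluation set.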