Chen Songjing, Wu Sizhu
Institute of Medical Information and Library, Chinese Academy of Medical Sciences / Peking Union Medical College, Beijing, China.
J Med Internet Res. 2020 Mar 17;22(3):e17695. doi: 10.2196/17695.
Lung cancer is one of the most dangerous malignant tumors, with the fastest-growing morbidity and mortality, especially in the elderly. With the rapid growth of the elderly population in recent years, lung cancer prevention and control have become increasingly important, but they are complicated by the fact that the pathogenesis of lung cancer is a complex process involving a variety of risk factors.
This study aimed to identify key risk factors for lung cancer incidence in the elderly and to quantify each risk factor's degree of influence using a deep learning method.
Using web-based survey data, we integrated multidisciplinary risk factors, including behavioral risk factors, disease history factors, environmental factors, and demographic factors, and then preprocessed the integrated data. We trained deep neural network models on a stratified elderly population. We then used these models to extract risk factors for lung cancer in the elderly and to quantify each factor's degree of influence.
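The abstract does not state how the degree of influence was read out of the trained networks. As a minimal sketch under that caveat, the Python below shows one generic option, permutation importance: shuffle one feature at a time and measure how much accuracy drops. Every name here (`stratify`, `permutation_importance`, `stand_in_predict`, the `age` and `sex` keys) is hypothetical, the data are synthetic, and a fixed rule stands in for a trained deep neural network.

```python
import random

def stratify(records):
    """Group records into the four strata named in the abstract."""
    groups = {"all": [], "age>=65": [], "women>=65": [], "men>=65": []}
    for rec in records:
        groups["all"].append(rec)
        if rec["age"] >= 65:
            groups["age>=65"].append(rec)
            # Assumption: sex is coded "F" or "M" in this illustrative schema.
            key = "women>=65" if rec["sex"] == "F" else "men>=65"
            groups[key].append(rec)
    return groups

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_features, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for j in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            total += baseline - accuracy(predict, shuffled, labels)
        drops.append(total / n_repeats)
    return drops

# Synthetic demo data (not from the study): feature 0 drives the label,
# feature 1 is pure noise.
demo_rng = random.Random(42)
rows = [[i % 2, demo_rng.randint(0, 1)] for i in range(200)]
labels = [row[0] for row in rows]

def stand_in_predict(row):
    """Hypothetical stand-in for a trained network's predicted class."""
    return 1 if row[0] >= 1 else 0

importances = permutation_importance(stand_in_predict, rows, labels, n_features=2)

records = [
    {"age": 70, "sex": "M"},
    {"age": 68, "sex": "F"},
    {"age": 40, "sex": "M"},
]
groups = stratify(records)
```

In this sketch a larger accuracy drop marks a more influential feature, and running the same procedure separately on each stratum would allow the rankings for older men and older women to be compared.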
The proposed models quantitatively identified risk factors based on data from 235,673 adults. The deep neural network models for the 4 groups (age ≥65 years, women ≥65 years, men ≥65 years, and the whole population) achieved good performance in identifying lung cancer risk factors, with accuracy ranging from 0.927 (95% CI 0.223-0.525; P=.002) to 0.962 (95% CI 0.530-0.751; P=.002) and the area under the curve ranging from 0.913 (95% CI 0.564-0.803) to 0.931 (95% CI 0.499-0.593). Smoking frequency was the leading risk factor for lung cancer in men 65 years and older. Time since quitting and having smoked at least 100 cigarettes in their lifetime were the main risk factors for lung cancer in women 65 years and older. Men 65 years and older had the highest lung cancer incidence among the stratified groups, particularly non-small cell lung cancer incidence. As the smoking rate declined, lung cancer incidence decreased more markedly in men than in women.
This study demonstrated a quantitative method to identify risk factors of lung cancer in the elderly. The proposed models provided intervention indicators to prevent lung cancer, especially in older men. This approach might be used as a risk factor identification tool to apply in other cancers and help physicians make decisions on cancer prevention.