Kim Woo Jung, Sung Ji Min, Sung David, Chae Myeong-Hun, An Suk Kyoon, Namkoong Kee, Lee Eun, Chang Hyuk-Jae
Department of Psychiatry, Myongji Hospital, Hanyang University College of Medicine, Goyang, Republic of Korea.
Institute of Behavioral Science in Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea.
JMIR Med Inform. 2019 Aug 30;7(3):e13139. doi: 10.2196/13139.
As the world's population ages, there is a growing need to predict and prevent dementia in the general population. The availability of national time-series health examination data in South Korea provides an opportunity to use deep learning algorithms, an artificial intelligence technology, to expedite the analysis of large volumes of sequential data.
This study aimed to compare the discriminative accuracy of a time-series deep learning algorithm with that of conventional statistical methods in predicting all-cause dementia and Alzheimer dementia from periodic health examination data.
Diagnostic codes in medical claims data from a South Korean national health examination cohort were used to identify individuals who developed all-cause dementia or Alzheimer dementia over a 10-year period. At baseline, 479,845 individuals free of all-cause dementia and 465,081 individuals free of Alzheimer dementia, all aged 40 to 79 years, were identified. Three models were compared on how well they predicted which individuals would develop either type of dementia: a Cox proportional hazards model using only baseline data (HR-B), a Cox proportional hazards model using repeated measurements (HR-R), and a deep learning model using repeated measurements (DL-R).
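To make the Cox proportional hazards component concrete, the following is a minimal sketch of the log partial likelihood that such a model maximizes. It is illustrative only and is not the authors' implementation: the function name `cox_partial_log_likelihood`, its arguments, and the assumption of no tied event times are all hypothetical choices made for this example.

```python
import math


def cox_partial_log_likelihood(times, events, risk_scores):
    """Log partial likelihood of a Cox model (assumes no tied event times).

    times       -- follow-up time for each individual
    events      -- 1 if dementia was diagnosed at that time, 0 if censored
    risk_scores -- linear predictor (beta . x) for each individual
    """
    # Visit individuals from longest to shortest follow-up so that the
    # running sum always covers the risk set: everyone still under
    # observation at the current individual's time.
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    log_likelihood = 0.0
    risk_set_sum = 0.0
    for i in order:
        risk_set_sum += math.exp(risk_scores[i])
        if events[i]:
            # Contribution of one event: own score minus log of the
            # summed relative hazards in the risk set.
            log_likelihood += risk_scores[i] - math.log(risk_set_sum)
    return log_likelihood
```

With all risk scores equal to zero, each event contributes minus the log of its risk-set size, which gives a quick sanity check of the sketch.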
The discrimination indices (95% CI) for the HR-B, HR-R, and DL-R models to predict all-cause dementia were 0.84 (0.83-0.85), 0.87 (0.86-0.88), and 0.90 (0.90-0.90), respectively, and those to predict Alzheimer dementia were 0.87 (0.86-0.88), 0.90 (0.88-0.91), and 0.91 (0.91-0.91), respectively. The DL-R model showed the best performance, followed by the HR-R model, in predicting both types of dementia. The DL-R model was superior to the HR-R model in all validation groups tested.
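The abstract does not say how the discrimination index was computed. One common choice for time-to-event predictions is Harrell's concordance index, sketched below under that assumption; the function name `concordance_index` and its arguments are hypothetical. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of who develops the outcome earlier.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (quadratic-time sketch).

    A pair (i, j) is comparable when individual i had the event and did so
    before individual j's observed time. The pair is concordant when the
    model gave i the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored individual cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

The double loop is kept for readability; on a cohort of several hundred thousand individuals, a sorted or tree-based implementation would be needed in practice.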
A deep learning algorithm using time-series data can be an accurate and cost-effective method to predict dementia. A combination of deep learning and proportional hazards models might help to enhance prevention strategies for dementia.