Graduate School of Computer Science, Nihon University, Koriyama, Japan.
Department of Computer Science, College of Engineering, Nihon University, Koriyama, Japan.
Adv Exp Med Biol. 2024;1463:215-219. doi: 10.1007/978-3-031-67458-7_36.
This study investigates whether data augmentation improves dementia risk prediction with deep neural networks (DNNs). Previous research has shown that basic blood test data are a cost-effective and important predictor of cognitive function, as measured by Mini-Mental State Examination (MMSE) scores. However, building models that accommodate varied conditions remains a significant challenge because of constraints on blood test and MMSE data: high costs, limited sample sizes, and missing values for tests that some facilities do not conduct. Periodontal examinations have also emerged as a cost-effective tool for mass screening. To address these issues, this study explores the use of generative adversarial networks (GANs) to generate synthesised data from blood test and periodontal examination results. We used DNNs with four hidden layers to compare prediction accuracy between real and GAN-synthesised data from 108 participants at Nihon University Itabashi Hospital. DNNs using GAN-synthesised data achieved a mean absolute error (MAE) of 1.91 ± 0.30, compared with 2.04 ± 0.37 for real data, indicating improved accuracy with synthesised data. Importantly, models using synthesised data were more robust to missing important variables, including age information, and handled data imbalance better. Given the difficulty of amassing extensive medical data, this augmentation approach is promising for refining dementia risk prediction.
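For readers unfamiliar with the reported metric, the following is a minimal Python sketch of how an MAE figure of the form "mean ± spread" can be computed. The abstract does not state how the ± value was obtained; the sketch assumes it is the sample standard deviation of per-fold MAEs over repeated evaluation splits. The function names (`mean_absolute_error`, `summarise_mae`) and the toy MMSE scores are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean, stdev


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarise_mae(folds):
    """Return (mean, sample standard deviation) of the per-fold MAEs.

    `folds` is a list of (y_true, y_pred) pairs, one per evaluation split.
    Assumption: the "±" in the abstract is a spread across such splits.
    """
    maes = [mean_absolute_error(y_true, y_pred) for y_true, y_pred in folds]
    return mean(maes), stdev(maes)


# Toy MMSE scores (made up for illustration only, not study data).
toy_folds = [
    ([28, 25, 30], [27, 27, 29]),
    ([22, 29, 26], [24, 28, 26]),
    ([30, 24, 27], [28, 25, 29]),
]

avg, sd = summarise_mae(toy_folds)
print(f"MAE = {avg:.2f} ± {sd:.2f}")
```

Under this assumption, the same summary would be computed once for models evaluated with real data and once for models using GAN-synthesised data, and the two results compared.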