Wan Cheng, Ge Xuewen, Wang Junjie, Zhang Xin, Yu Yun, Hu Jie, Liu Yun, Ma Hui
Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China.
Institute of Medical Informatics and Management, Nanjing Medical University, Nanjing, China.
Front Psychiatry. 2022 May 20;13:861930. doi: 10.3389/fpsyt.2022.861930. eCollection 2022.
Mood disorders are common mental disorders with familial aggregation. Extracting family history of psychiatric disorders from large electronic hospitalization records is helpful for further study of onset characteristics among patients with a mood disorder. This study uses an observational clinical data set of in-patients of Nanjing Brain Hospital, affiliated with Nanjing Medical University, from the past 10 years. This paper proposes a model built on a pretrained language model: Bidirectional Encoder Representations from Transformers (BERT)-Convolutional Neural Network (CNN). We first project the electronic hospitalization records into a low-dimensional dense matrix using the pretrained Chinese BERT model, then feed the dense matrix into stacked CNN layers to capture high-level features of the texts; finally, we use a fully connected layer to extract family history based on these high-level features. The accuracy of our BERT-CNN model was 97.12 ± 0.37% on the real-world data set from Nanjing Brain Hospital. We further studied the correlation between mood disorders and family history of psychiatric disorders.
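The embed, convolve, and classify steps described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a random per-character lookup table stands in for the pretrained Chinese BERT encoder, the weights are random and untrained, the dimensions are tiny, max-over-time pooling is one common way to reduce the convolution output, and the two-class output (family history present or absent) and the sample sentence are hypothetical. A real system would use a deep learning framework and an actual BERT checkpoint.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8      # assumption: tiny for readability (BERT-base uses 768)
NUM_FILTERS = 4    # assumption: filters per convolutional layer
KERNEL_SIZE = 3    # assumption: each filter spans 3 consecutive tokens
NUM_CLASSES = 2    # assumption: family history present / absent


def rand_matrix(rows, cols):
    """Random, untrained weights used only to make the sketch runnable."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def embed(text, table):
    """Stand-in for BERT: map each character to a fixed random dense vector."""
    for ch in text:
        if ch not in table:
            table[ch] = [random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
    return [table[ch] for ch in text]


def conv1d_relu(seq, filters):
    """Slide every filter over windows of KERNEL_SIZE vectors, then apply ReLU."""
    out = []
    for start in range(len(seq) - KERNEL_SIZE + 1):
        window = [x for vec in seq[start:start + KERNEL_SIZE] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def max_pool(feature_maps):
    """Max-over-time pooling: keep the strongest response of each filter."""
    return [max(column) for column in zip(*feature_maps)]


def dense_softmax(features, weights):
    """Fully connected layer followed by a numerically stable softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, table, conv_layers, fc_weights):
    """Embed the text, apply the stacked convolutions, pool, then classify."""
    seq = embed(text, table)
    for filters in conv_layers:
        seq = conv1d_relu(seq, filters)
    return dense_softmax(max_pool(seq), fc_weights)


table = {}
conv_layers = [
    rand_matrix(NUM_FILTERS, KERNEL_SIZE * EMBED_DIM),    # first CNN layer
    rand_matrix(NUM_FILTERS, KERNEL_SIZE * NUM_FILTERS),  # second, stacked layer
]
fc_weights = rand_matrix(NUM_CLASSES, NUM_FILTERS)

# Hypothetical record fragment: "denies family history of mental illness"
probs = classify("否认家族精神病史", table, conv_layers, fc_weights)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how a record flows from a dense token matrix through stacked convolutions to a class distribution.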