IEEE J Biomed Health Inform. 2022 Apr;26(4):1737-1748. doi: 10.1109/JBHI.2021.3123192. Epub 2022 Apr 14.
Patients experience various symptoms when they have acute or chronic diseases or undergo treatment for them. Symptoms often indicate the severity of a disease and the need for hospitalization. However, they are typically described in free text in the clinical notes of Electronic Health Records (EHR) and are not integrated with other clinical factors for disease prediction and healthcare outcome management. In this research, we propose a novel deep language model to extract patient-reported symptoms from clinical text. The model integrates syntactic and semantic analysis for symptom extraction and distinguishes the symptoms actually reported by patients from conditional or negated symptoms. It can extract both complex and straightforward symptom expressions. We evaluated our model on a real-world clinical notes dataset and demonstrated that it achieves superior performance compared to three other state-of-the-art symptom extraction models. We also analyzed the model extensively, examining each component's contribution to its effectiveness. Finally, we applied our model to a COVID-19 tweets dataset to extract COVID-19 symptoms. The results show that our model can identify all the symptoms suggested by the Centers for Disease Control and Prevention (CDC) ahead of the CDC's timeline, as well as many rare symptoms.