Tao Shiqiang, Abeysinghe Rashmie, De La Esperanza Blanca Talavera, Lhatoo Samden, Zhang Guo-Qiang, Cui Licong
Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX.
School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX.
AMIA Jt Summits Transl Sci Proc. 2023 Jun 16;2023:515-524. eCollection 2023.
Early seizure onset is a potential risk factor for Sudden Unexpected Death in Epilepsy (SUDEP). However, information about a patient's first seizure onset is often documented only as clinical narrative in epilepsy monitoring unit (EMU) discharge summaries, and manually extracting the onset time from these summaries is time-consuming and labor-intensive. In this work, we developed a rule-based natural language processing pipeline that automatically extracts the temporal information of patients' first seizure onset from EMU discharge summaries. We used the Epilepsy and Seizure Ontology (EpSO) as the core knowledge resource and constructed 4 extraction rules based on 300 randomly selected EMU discharge summaries. To evaluate the pipeline, we applied the constructed rules to another 200 unseen discharge summaries and compared the results against manual evaluation by a domain expert. Overall, the extraction pipeline achieved a precision of 0.75, a recall of 0.651, and an F1-score of 0.697. This is an encouraging initial result that will allow us to gain insights into potentially better-performing approaches.
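As an illustration only, the following Python sketch shows what a single rule-based extraction step and the reported F1 computation might look like. The pattern `ONSET_RULE` and the helper `extract_onset` are hypothetical: they are not among the 4 rules constructed in this work, and they do not use EpSO. The `f1_score` helper is the standard harmonic mean of precision and recall, applied here to the precision and recall values reported above.

```python
import re

# Hypothetical pattern (an assumption, not one of the paper's four rules):
# a seizure-onset cue followed, within the same sentence, by an age
# expression such as "at age 12" or "at the age of 3 months".
ONSET_RULE = re.compile(
    r"(?:first seizure|seizure onset|seizures? (?:began|started))"
    r"[^.]{0,40}?"
    r"\bat (?:the )?age (?:of )?(\d{1,3})\s*(years?|months?|weeks?|days?)?",
    re.IGNORECASE,
)


def extract_onset(text):
    """Return (value, unit) for the first match, or None if no rule fires.

    The unit is singular ("year", "month", ...) and defaults to "year"
    when the narrative gives a bare number.
    """
    match = ONSET_RULE.search(text)
    if match is None:
        return None
    unit = (match.group(2) or "years").lower().rstrip("s")
    return int(match.group(1)), unit


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(extract_onset("Her first seizure occurred at age 12."))
    print(extract_onset("Seizures began at the age of 3 months."))
    print(round(f1_score(0.75, 0.651), 3))
```

A single regular expression like this would miss many phrasings found in real discharge summaries (for example, calendar dates or "since childhood"), which is one reason a rule set built on an ontology of seizure terms would be expected to generalize better than this toy pattern.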