Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Epilepsia. 2023 Jul;64(7):1900-1909. doi: 10.1111/epi.17633. Epub 2023 May 10.
Electronic medical records allow for retrospective clinical research with large patient cohorts. However, epilepsy outcomes are often contained in free text notes that are difficult to mine. We recently developed and validated novel natural language processing (NLP) algorithms to automatically extract key epilepsy outcome measures from clinic notes. In this study, we assessed the feasibility of extracting these measures to study the natural history of epilepsy at our center.
We applied our previously validated NLP algorithms to extract seizure freedom, seizure frequency, and date of most recent seizure from outpatient visits at our epilepsy center from 2010 to 2022. We examined the dynamics of seizure outcomes over time using Markov model-based probability and Kaplan-Meier analyses.
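As a rough illustration of the Markov model-based probability analysis described above, the sketch below estimates the probability of seizure freedom at the next visit, conditioned on the seizure-freedom status at the preceding three visits. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `next_visit_probabilities`, the boolean per-visit encoding, and the toy histories are all hypothetical.

```python
from collections import defaultdict


def next_visit_probabilities(histories, order=3):
    """Estimate P(seizure-free at next visit | status at the previous `order` visits).

    `histories` is a list of per-patient visit sequences, where each visit is
    True (seizure-free since last visit) or False (having seizures).
    This encoding is an assumption made for illustration.
    """
    # context (tuple of prior statuses) -> [transitions seen, seizure-free next visits]
    counts = defaultdict(lambda: [0, 0])
    for visits in histories:
        for i in range(order, len(visits)):
            context = tuple(visits[i - order:i])
            counts[context][0] += 1
            counts[context][1] += int(visits[i])
    return {ctx: free / total for ctx, (total, free) in counts.items()}


# Toy, made-up histories (five visits each); not data from the study.
histories = [
    [True, True, True, True, False],
    [False, False, False, False, True],
    [False, False, False, False, False],
]
probs = next_visit_probabilities(histories)
print(probs[(True, True, True)])     # seizure-free at all three prior visits
print(probs[(False, False, False)])  # having seizures at all three prior visits
```

Counting transitions per context is the simplest estimator for such a model; with real data, contexts with few observations would likely need confidence intervals or smoothing.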
Performance of our algorithms on classifying seizure freedom was comparable to that of human reviewers (algorithm F = .88 vs. human annotator = .86). We extracted seizure outcome data from 55 630 clinic notes from 9510 unique patients written by 53 unique authors. Of these, 30% were classified as seizure-free since the last visit, 48% of non-seizure-free visits contained a quantifiable seizure frequency, and 47% of all visits contained the date of most recent seizure occurrence. Among patients with at least five visits, the probability of seizure freedom at the next visit ranged from 12% in patients having seizures at the prior three visits to 80% in patients who were seizure-free at the prior three visits. Only 25% of patients who were seizure-free for 6 months remained seizure-free after 10 years.
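The long-term figure above comes from a Kaplan-Meier analysis. The following is a minimal sketch of the standard Kaplan-Meier product-limit estimator in plain Python, assuming each patient contributes a follow-up duration and a flag indicating whether a seizure recurrence was observed (otherwise the patient is censored). The function name `kaplan_meier` and the toy inputs are hypothetical and are not taken from the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    durations: follow-up time per patient (any consistent unit).
    events: True if seizure recurrence was observed at that time,
            False if the patient was censored (lost to follow-up).
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # recurrences at time t
        removed = 0   # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += int(data[i][1])
            removed += 1
            i += 1
        if observed:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Toy, made-up follow-up data (years); not data from the study.
durations = [1, 2, 2, 3, 4]
events = [True, True, False, True, False]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t}: estimated probability of remaining seizure-free = {s:.2f}")
```

Censored patients leave the risk set without lowering the survival estimate, which is what lets the method use patients with incomplete follow-up.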
Our findings demonstrate that epilepsy outcome measures can be extracted accurately from unstructured clinical note text using NLP. At our tertiary center, the disease course often followed a remitting and relapsing pattern. This method represents a powerful new tool for clinical research with many potential uses and extensions to other clinical questions.