Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA; University of Pennsylvania, Center for Neuroengineering and Therapeutics, Philadelphia, PA.
Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA.
Genet Med. 2024 Nov;26(11):101211. doi: 10.1016/j.gim.2024.101211. Epub 2024 Jul 14.
An early genetic diagnosis can guide the time-sensitive treatment of individuals with genetic epilepsies. However, most genetic diagnoses occur long after disease onset. We aimed to identify early clinical features suggestive of genetic diagnoses in individuals with epilepsy through large-scale analysis of full-text electronic medical records.
We extracted 89 million time-stamped standardized clinical annotations using natural language processing from 4,572,783 clinical notes from 32,112 individuals with childhood epilepsy, including 1,925 individuals with known or presumed genetic epilepsies. We applied these features to train random forest models to predict SCN1A-related disorders and any genetic diagnosis.
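As an illustration of how time-stamped annotations might be turned into model features, the following minimal sketch groups each individual's annotations into age windows. The 3-month window width, the term names, and the record layout are assumptions made for this example; the abstract does not describe the actual feature encoding.

```python
from collections import defaultdict

BIN_MONTHS = 3  # assumed window width; not stated in the abstract


def age_bin(age_months, width=BIN_MONTHS):
    """Return the (start, end) age window in months that contains age_months."""
    start = int(age_months // width) * width
    return (start, start + width)


def build_features(annotations):
    """Collapse annotations into one set of (term, age window) features per individual.

    annotations: iterable of (individual_id, age_in_months, term) tuples.
    """
    features = defaultdict(set)
    for individual_id, age_months, term in annotations:
        features[individual_id].add((term, age_bin(age_months)))
    return dict(features)


# Hypothetical toy records, not data from the study.
toy = [
    ("A", 7.2, "developmental_delay"),
    ("A", 8.9, "developmental_delay"),
    ("A", 14.0, "seizure"),
    ("B", 6.0, "seizure"),
]
feats = build_features(toy)
```

Each resulting set could then be expanded into binary columns and passed to a classifier such as a random forest.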
We identified 47,774 age-dependent associations of clinical features with genetic etiologies a median of 3.6 years before molecular diagnosis. Across all 710 genetic etiologies identified in our cohort, neurodevelopmental differences between 6 and 9 months of age increased the likelihood of a later molecular diagnosis 5-fold (P < .0001, 95% CI = 3.55-7.42). A later diagnosis of SCN1A-related disorders (area under the curve [AUC] = 0.91) or an overall positive genetic diagnosis (AUC = 0.82) could be reliably predicted using random forest models.
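For readers less familiar with the AUC values reported above, the metric can be read as the probability that a randomly chosen individual with the diagnosis receives a higher predicted score than a randomly chosen individual without it. The sketch below computes it by direct pairwise comparison. The function name, labels, and scores are hypothetical, and this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = later genetic diagnosis) and model scores.
example = auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.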
Clinical features predictive of genetic epilepsies precede molecular diagnoses by up to several years in conditions with known precision treatments. An earlier diagnosis facilitated by automated electronic medical records analysis has the potential for earlier targeted therapeutic strategies in the genetic epilepsies.