Lo Barco Tommaso, Kuchenbuch Mathieu, Garcelon Nicolas, Neuraz Antoine, Nabbout Rima
Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France.
Child Neuropsychiatry, Department of Surgical Sciences, Dentistry, Gynecology and Pediatrics, University of Verona, Verona, Italy.
Orphanet J Rare Dis. 2021 Jul 13;16(1):309. doi: 10.1186/s13023-021-01936-9.
The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in health care. A promising use of big data in this field is to develop models that support early diagnosis and establish natural history. Dravet Syndrome (DS) is a rare developmental and epileptic encephalopathy that typically begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed until after 2 years of age, because DS is difficult to differentiate from FS at onset. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. Such concepts would allow earlier detection of patients with DS and, in turn, earlier referral to expert centers that can provide early diagnosis and care.
Data were collected from the Necker-Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing (NLP), a computer technology for processing written information. Using the Unified Medical Language System (UMLS) Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) with a diagnosis confirmed after the age of 4 years. A phenome-wide analysis was performed, using a series of logistic regressions to evaluate the statistical associations between the phenotypes of DS and FS, based on concepts found in reports produced before the age of 2 years.
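The phenome-wide step can be illustrated with a minimal Python sketch. It is not the authors' pipeline or the Dr Warehouse implementation: it assumes one univariate logistic regression per concept (cohort regressed on concept presence), for which the fitted coefficient equals the log odds ratio of the 2x2 table. The 0.5 continuity correction, the Bonferroni threshold, the function names and the toy counts are all assumptions introduced here for illustration.

```python
import math


def concept_association(ds_with, ds_total, fs_with, fs_total):
    """Odds ratio and two-sided Wald p-value for one concept.

    For a logistic regression with a single binary predictor, the
    maximum-likelihood coefficient is the log odds ratio of the 2x2
    table and its standard error is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    # Assumption: add 0.5 to every cell so that empty cells do not
    # produce a division by zero (this slightly shifts the estimate).
    a = ds_with + 0.5                 # DS patients with the concept
    b = ds_total - ds_with + 0.5      # DS patients without it
    c = fs_with + 0.5                 # FS patients with the concept
    d = fs_total - fs_with + 0.5      # FS patients without it
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return math.exp(log_or), p_value


def phenome_wide_scan(counts, ds_total, fs_total, alpha=0.05):
    """Test every concept and flag those passing a Bonferroni threshold.

    `counts` maps a concept label to (patients with the concept in the
    DS cohort, patients with the concept in the FS cohort).
    """
    threshold = alpha / len(counts)   # assumption: Bonferroni correction
    rows = []
    for concept, (ds_with, fs_with) in counts.items():
        odds_ratio, p_value = concept_association(
            ds_with, ds_total, fs_with, fs_total
        )
        rows.append((concept, odds_ratio, p_value, p_value < threshold))
    return sorted(rows, key=lambda row: row[2])  # most significant first


# Invented counts, for illustration only (not data from the study).
toy_counts = {
    "status epilepticus": (30, 5),
    "gait disorder": (12, 3),
    "otitis": (10, 25),
}
results = phenome_wide_scan(toy_counts, ds_total=40, fs_total=100)
for concept, odds_ratio, p_value, significant in results:
    print(f"{concept}: OR={odds_ratio:.2f}, p={p_value:.2g}, flagged={significant}")
```

A real analysis would normally fit the regressions with a statistics package and could adjust for covariates such as age or number of reports; the closed form above holds only for the unadjusted, single-predictor case.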
We found a significantly higher representation of concepts related to seizure phenotypes that distinguish DS from FS in the early phases, namely the greater recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and of other seizure types. Some typical early-onset non-seizure concepts also emerged, relating to neurodevelopment and gait disorders.
Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.