Tinker Rory J, Peterson Josh, Bastarache Lisa
medRxiv. 2023 Jan 18:2023.01.17.23284691. doi: 10.1101/2023.01.17.23284691.
The study of Mendelian disease has yielded a large body of knowledge about the phenotypic presentation of disease. Less is known about the way the diseases are reflected in the electronic health record (EHR).
To develop an EHR-based model of the diagnostic trajectory and investigate data availability and the longitudinal distribution of signs and symptoms of a Mendelian disorder within EHRs.
We created a conceptual model to specify key time points of the diagnostic trajectory and applied it to individuals with genetically confirmed hereditary connective tissue diseases (HCTD). Using the model, we assessed EHR data availability within each time interval. We tested the performance of phenotype risk scores (PheRS), an algorithm that detects Mendelian disease patterns, and assessed the phenotypic expression of HCTD over the diagnostic trajectory.
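To illustrate how a phenotype risk score might be computed, here is a minimal sketch. It assumes the commonly described formulation in which each disease feature present in a patient's record contributes a weight equal to the negative natural log of that feature's population prevalence; the abstract does not state the weighting, so the formula, the function name `phers`, and the example phenotype labels and prevalences are all illustrative assumptions.

```python
import math

def phers(patient_phenotypes, disease_phenotypes, prevalence):
    """Hypothetical PheRS sketch: sum the weights of the disease features
    that appear in the patient's record.

    Assumption: weight = -ln(prevalence), so rarer features count more.
    """
    return sum(
        -math.log(prevalence[p])
        for p in disease_phenotypes
        if p in patient_phenotypes
    )

# Illustrative (made-up) prevalences and feature sets.
prevalence = {"aortic_aneurysm": 0.01, "scoliosis": 0.1, "hypertension": 0.5}
disease_features = {"aortic_aneurysm", "scoliosis"}
patient_features = {"scoliosis", "hypertension"}

# Only "scoliosis" is both a disease feature and present in the patient.
score = phers(patient_features, disease_features, prevalence)
```

Under this weighting, a common feature adds little to the score while a rare one adds a lot, which is why the score tends to rise once rare, disease-specific findings are documented.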
We identified 251 individuals with HCTD; 79 (35%) of these patients had a fully ascertained diagnostic trajectory. Few signs and symptoms suggestive of HCTD were documented before clinical suspicion (median PheRS 0.14); once suspicion was documented, median PheRS increased to 1.87 (SD). The majority (72%) of phenotypic features were identified after clinical suspicion.
Using a novel conceptual model for the diagnostic trajectory of Mendelian disease, we demonstrated that phenotype ascertainment is, in part, driven by the diagnostic process and that many findings are only documented following clinical suspicion and diagnosis, a process we term phenotypic convergence. Therefore, algorithms that aim to detect undiagnosed Mendelian disease should censor EHR data to avoid data leakage.
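The censoring recommendation can be sketched as follows. This is a minimal illustration, assuming each EHR record is a `(date, code)` pair and that the date of first documented clinical suspicion is known for each patient; the function name `censor_records` and the example codes are hypothetical and do not come from the study.

```python
from datetime import date

def censor_records(records, suspicion_date):
    """Keep only records dated strictly before clinical suspicion.

    Records on or after the suspicion date are dropped because they may
    reflect the diagnostic workup rather than the undiagnosed state.
    """
    return [(d, code) for d, code in records if d < suspicion_date]

# Illustrative (made-up) records for one patient.
records = [
    (date(2015, 3, 1), "joint_pain"),
    (date(2018, 6, 10), "aortic_dilation"),   # documented at suspicion
    (date(2018, 9, 2), "ectopia_lentis"),     # documented during workup
]
suspicion = date(2018, 6, 10)

censored = censor_records(records, suspicion)
```

A detection algorithm evaluated on `censored` rather than `records` sees only what was available before anyone suspected the disease, which is the setting in which it would actually be deployed.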