Walters Courtney E, Nitin Rachana, Margulis Katherine, Boorom Olivia, Gustavson Daniel E, Bush Catherine T, Davis Lea K, Below Jennifer E, Cox Nancy J, Camarata Stephen M, Gordon Reyna L
Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN.
Neuroscience Program, College of Arts and Science, Vanderbilt University, Nashville, TN.
J Speech Lang Hear Res. 2020 Sep 15;63(9):3019-3035. doi: 10.1044/2020_JSLHR-19-00397. Epub 2020 Aug 11.
Purpose: Data mining algorithms using electronic health records (EHRs) are useful in large-scale population-wide studies to classify etiology and comorbidities (Casey et al., 2016). Here, we apply this approach to developmental language disorder (DLD), a prevalent communication disorder whose risk factors and epidemiology remain largely undiscovered. Method: We first created a reliable system for manually identifying DLD in EHRs based on speech-language pathologist (SLP) diagnostic expertise. We then developed and validated an automated algorithmic procedure, called the Automated Phenotyping Tool for identifying DLD cases in health systems data (APT-DLD), that classifies DLD status for patients within EHRs on the basis of ICD (International Statistical Classification of Diseases and Related Health Problems) codes. APT-DLD was validated in a discovery sample (n = 973) using expert SLP manual phenotype coding as a gold-standard comparison and then applied and further validated in a replication sample of n = 13,652 EHRs. Results: In the discovery sample, the APT-DLD algorithm correctly classified 98% of DLD cases (concordance with manually coded records in the training set), indicating that APT-DLD successfully mimics a comprehensive chart review. The output of APT-DLD was also validated against independently conducted SLP clinician coding in a subset of records, with a positive predictive value of 95% of cases correctly classified as DLD. We also applied APT-DLD to the replication sample, where it achieved a positive predictive value of 90% in relation to SLP clinician classification of DLD. Conclusions: APT-DLD is a reliable, valid, and scalable tool for identifying DLD cohorts in EHRs. This new method has promising public health implications for future large-scale epidemiological investigations of DLD and may inform EHR data mining algorithms for other communication disorders. Supplemental Material: https://doi.org/10.23641/asha.12753578.
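The validation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes each record carries two boolean labels, one from the automated algorithm and one from SLP clinician review. The `Record` type, its field names, and the toy data are hypothetical and are not taken from the study. Positive predictive value is computed in the standard way (true positives divided by all algorithm-flagged positives). Concordance is treated here as simple label agreement, which is an assumption about how the paper defines it.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields: neither name comes from the paper.
    algorithm_says_dld: bool   # label assigned by the automated procedure
    clinician_says_dld: bool   # gold-standard label from manual review


def positive_predictive_value(records):
    """PPV = true positives / all records the algorithm flagged as cases."""
    flagged = [r for r in records if r.algorithm_says_dld]
    if not flagged:
        raise ValueError("no records were flagged as cases")
    true_positives = sum(r.clinician_says_dld for r in flagged)
    return true_positives / len(flagged)


def concordance(records):
    """Fraction of records on which the two labels agree (assumed definition)."""
    if not records:
        raise ValueError("no records supplied")
    agreeing = sum(r.algorithm_says_dld == r.clinician_says_dld for r in records)
    return agreeing / len(records)


# Toy data for illustration only; these are not the study's records.
toy_records = (
    [Record(True, True)] * 9        # flagged and confirmed by the clinician
    + [Record(True, False)] * 1     # flagged but not confirmed
    + [Record(False, False)] * 10   # not flagged, not a case
)

ppv = positive_predictive_value(toy_records)
agreement = concordance(toy_records)
print(f"PPV = {ppv:.2f}, concordance = {agreement:.2f}")
```

On the toy data, 9 of the 10 flagged records are confirmed, so the PPV is 0.90, and 19 of the 20 labels agree, so the concordance is 0.95. These figures describe the toy data only and have no bearing on the reported results.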