Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, Minn; Asthma Epidemiology Research Unit, Mayo Clinic, Rochester, Minn.
Department of Health Sciences Research, Mayo Clinic, Rochester, Minn.
J Allergy Clin Immunol Pract. 2018 Jan-Feb;6(1):126-131. doi: 10.1016/j.jaip.2017.04.041. Epub 2017 Jun 19.
We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic.
We aimed to adapt NLP-PAC to a different health care setting, Sanford Children's Hospital, and to assess its external validity.
This was a retrospective cohort study that used a random sample of the 2011-2012 Sanford birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. We also tested the association of known asthma-related risk factors with asthma status as ascertained by the Sanford-NLP algorithm.
Among the eligible test cohort (n = 297), 160 (53%) were male, 268 (90%) were white, and the median age was 2.3 years (range, 1.5-3.1 years). NLP-PAC, after adaptation, and the human abstractor identified 74 (25%) and 72 (24%) subjects, respectively, as having asthma, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92%, 96%, 89%, and 97%, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review.
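The reported accuracy figures can be checked against the counts given above. The following is a minimal sketch, assuming manual chart review is the reference standard and that the 2 x 2 table can be derived from the totals alone (297 subjects, 74 NLP-positive, 72 abstractor-positive, 66 positive by both); the function name `diagnostic_metrics` and its parameters are hypothetical and are not part of NLP-PAC.

```python
def diagnostic_metrics(n_total, n_index, n_reference, n_both):
    """Derive a 2 x 2 table and accuracy measures from marginal counts.

    n_index     : subjects flagged by the index test (here, the NLP algorithm)
    n_reference : subjects flagged by the reference standard (manual review)
    n_both      : subjects flagged by both
    """
    tp = n_both                   # flagged by NLP and by manual review
    fp = n_index - n_both         # flagged by NLP only
    fn = n_reference - n_both     # flagged by manual review only
    tn = n_total - tp - fp - fn   # flagged by neither
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts taken from the test cohort described above.
metrics = diagnostic_metrics(n_total=297, n_index=74, n_reference=72, n_both=66)

for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {metrics[name]:.0%}")
```

Under these assumptions the derived table is 66 true positives, 8 false positives, 6 false negatives, and 217 true negatives, which rounds to the reported 92%, 96%, 89%, and 97%.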
Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research.