Wei Jia, Yuan Kevin, Luk Augustine, Walker A Sarah, Eyre David W
Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom.
PLOS Digit Health. 2025 Jul 21;4(7):e0000936. doi: 10.1371/journal.pdig.0000936. eCollection 2025 Jul.
Community-acquired pneumonia (CAP) is common and a significant cause of mortality. However, CAP surveillance commonly relies on diagnostic codes from electronic health records (EHRs), with imperfect accuracy. We used Bayesian latent class models with multiple imputation to assess the accuracy of CAP diagnostic codes in the absence of a gold standard and to explore the contribution of various EHR data sources in improving CAP identification. Using 491,681 hospital admissions in Oxfordshire, UK, from 2016 to 2023, we investigated four EHR-based algorithms for CAP detection based on 1) primary diagnostic codes, 2) clinician-documented indications for antibiotic prescriptions, 3) radiology free-text reports, and 4) vital signs and blood tests. The estimated prevalence of CAP as the reason for emergency hospital admission was 13.6% (95% credible interval 13.3-14.0%). Primary diagnostic codes had low sensitivity but a high specificity (best fitting model, 0.275 and 0.997 respectively), as did vital signs with blood tests (0.348 and 0.963). Antibiotic indication text had a higher sensitivity (0.590) but a lower specificity (0.982), with radiology reports intermediate (0.485 and 0.960). Defining CAP as present when detected by any algorithm produced sensitivity and specificity of 0.873 and 0.905 respectively. Results remained consistent using alternative priors and in sensitivity analyses. Relying solely on diagnostic codes for CAP surveillance leads to substantial under-detection; combining EHR data across multiple algorithms enhances identification accuracy. Bayesian latent class analysis-based approaches could improve CAP surveillance and epidemiological estimates by integrating multiple EHR sources, even without a gold standard for CAP diagnosis.
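The combined "any algorithm" figures can be sanity-checked from the per-algorithm estimates quoted above. The following is a minimal sketch, not the authors' Bayesian latent class model: it assumes the four algorithms are conditionally independent given true CAP status, which the abstract does not state and the best-fitting model may not assume. The names `ACCURACY`, `any_positive_rule` and `apparent_prevalence` are illustrative only.

```python
from math import prod

# Point estimates (sensitivity, specificity) quoted in the abstract.
ACCURACY = {
    "primary_diagnostic_codes": (0.275, 0.997),
    "vital_signs_and_blood_tests": (0.348, 0.963),
    "antibiotic_indication_text": (0.590, 0.982),
    "radiology_reports": (0.485, 0.960),
}
PREVALENCE = 0.136  # estimated CAP prevalence among emergency admissions


def any_positive_rule(accuracy):
    """Accuracy of 'CAP present if any algorithm fires'.

    ASSUMPTION: algorithms are conditionally independent given true status.
    """
    # A true case is missed only if every algorithm misses it.
    sensitivity = 1 - prod(1 - se for se, _ in accuracy.values())
    # A non-case stays negative only if every algorithm stays negative.
    specificity = prod(sp for _, sp in accuracy.values())
    return sensitivity, specificity


def apparent_prevalence(prevalence, sensitivity, specificity):
    """Expected share of admissions flagged: true positives plus false positives."""
    return prevalence * sensitivity + (1 - prevalence) * (1 - specificity)


se_any, sp_any = any_positive_rule(ACCURACY)
code_only = apparent_prevalence(PREVALENCE, *ACCURACY["primary_diagnostic_codes"])

print(f"any-algorithm rule: sensitivity={se_any:.3f}, specificity={sp_any:.3f}")
print(f"share flagged by diagnostic codes alone: {code_only:.3f}")
```

Under this assumption the specificity comes out at about 0.905, matching the reported value, while the sensitivity comes out at about 0.900 rather than the reported 0.873. The gap is consistent with some dependence between algorithms among true cases, but that is an inference from the arithmetic, not something the abstract states. The last line shows the under-detection point: with a sensitivity of 0.275, diagnostic codes alone would flag roughly 4.0% of admissions against an estimated prevalence of 13.6%.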