Department of Medicine, University of California San Francisco, San Francisco, CA, United States of America.
Kaiser Permanente Medical Group, Kaiser Permanente Northern California, Oakland, CA, United States of America.
PLoS One. 2023 Mar 10;18(3):e0280342. doi: 10.1371/journal.pone.0280342. eCollection 2023.
Epidemiological studies of interstitial lung disease (ILD) are limited by small sample sizes and tertiary care bias. Investigators have leveraged the widespread use of electronic health records (EHRs) to overcome these limitations, but have struggled to extract the patient-level, longitudinal clinical data needed to address many important research questions. We hypothesized that we could automate longitudinal ILD cohort development using the EHR of a large, community-based healthcare system.
We applied a previously validated algorithm to the EHR of a community-based healthcare system to identify ILD cases between 2012 and 2020. We then extracted disease-specific characteristics and outcomes using fully automated data-extraction algorithms and natural language processing of selected free text.
We identified a community cohort of 5,399 ILD patients (prevalence = 118 per 100,000). Pulmonary function tests (71%) and serologies (54%) were commonly used in the diagnostic evaluation, whereas lung biopsy was rare (5%). Idiopathic pulmonary fibrosis (IPF) was the most common ILD diagnosis (n = 972, 18%). Prednisone was the most commonly prescribed medication (n = 911, 17%). Nintedanib and pirfenidone were rarely prescribed (n = 305, 5%). ILD patients were high utilizers of inpatient care (40% hospitalized per year) and outpatient care (80% with a pulmonary visit per year), with sustained utilization throughout the post-diagnosis study period.
We demonstrated the feasibility of robustly characterizing a variety of patient-level utilization and health services outcomes in a community-based EHR cohort. This represents a substantial methodological improvement by alleviating traditional constraints on the accuracy and clinical resolution of such ILD cohorts; we believe this approach will make community-based ILD research more efficient, effective, and scalable.