Pulmonary and Critical Care Medicine, Johns Hopkins University, Baltimore, Maryland, USA
Asymmetric Operations Sector, Johns Hopkins University Applied Physics Laboratory, Laurel, Maryland, USA.
BMJ Open Respir Res. 2024 Oct 3;11(1):e001896. doi: 10.1136/bmjresp-2023-001896.
In chronic obstructive pulmonary disease (COPD), accurately estimating lung function from electronic health record (EHR) data would be beneficial but requires addressing complexities in clinically obtained testing. This study compared analytic methods for estimating the rate of change in forced expiratory volume in one second (FEV1) from EHR data.
We estimated the rate of FEV1 change in patients with COPD from a single centre who had ≥3 outpatient tests spanning at least 1 year. Estimates were calculated as both absolute (mL/year) and relative (%/year) rates using non-regressive (Total Change, Average Change) and regressive (Quantile, RANSAC, Huber) methods. We compared the distributions of the estimates across methods, focusing on extreme values. Univariate zero-inflated negative binomial regressions tested associations between the estimates and all-cause or COPD hospitalisations. Results were validated in an external cohort.
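To make the contrast between the non-regressive and regressive estimators concrete, here is a minimal sketch in plain Python. It is not the study's code. The definitions of Total Change (last minus first value, divided by elapsed time) and Average Change (mean of the rates between consecutive tests) are assumptions, since the abstract does not define them. The Huber fit is a generic iteratively reweighted least-squares M-estimator with the conventional tuning constant 1.345 and a median-absolute-residual scale. All function names and the sample data are hypothetical.

```python
from statistics import median


def total_change(years, fev1_ml):
    """Assumed definition: (last value - first value) / elapsed years."""
    return (fev1_ml[-1] - fev1_ml[0]) / (years[-1] - years[0])


def average_change(years, fev1_ml):
    """Assumed definition: mean of the rates between consecutive tests."""
    rates = [
        (fev1_ml[i + 1] - fev1_ml[i]) / (years[i + 1] - years[i])
        for i in range(len(years) - 1)
    ]
    return sum(rates) / len(rates)


def weighted_line(x, y, w):
    """Closed-form weighted least-squares line; returns (slope, intercept)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def huber_slope(years, fev1_ml, k=1.345, iterations=50):
    """Huber-type robust slope via iteratively reweighted least squares."""
    weights = [1.0] * len(years)
    slope = 0.0
    for _ in range(iterations):
        slope, intercept = weighted_line(years, fev1_ml, weights)
        residuals = [y - (intercept + slope * t) for t, y in zip(years, fev1_ml)]
        # Robust residual scale from the median absolute residual.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break  # at least half of the points are fitted exactly
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside it.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]
    return slope


# Hypothetical patient: a true decline of 30 mL/year, an implausibly high
# first test, and a repeat test only 0.1 years later.
years = [0.0, 0.1, 1.0, 2.0, 3.0]
fev1_ml = [2600.0, 1997.0, 1970.0, 1940.0, 1910.0]

print("Total Change  :", total_change(years, fev1_ml))
print("Average Change:", average_change(years, fev1_ml))
print("Huber         :", huber_slope(years, fev1_ml))
```

On this made-up series, Total Change gives -230 mL/year and Average Change about -1530 mL/year, because the outlying first value is divided by a 0.1-year interval. The Huber fit progressively downweights that point and settles close to the underlying -30 mL/year. A real analysis would more likely rely on an established implementation of Huber, RANSAC and quantile regression than on this hand-rolled loop.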
Among 1417 participants, the median rate of change was approximately -30 mL/year or -2%/year. Non-regressive methods frequently generated erroneous estimates due to outlier first measurements or short intervals between tests. Average Change yielded the most extreme estimates (minimum=-3761 mL/year), while regressive methods, and Huber specifically, minimised extreme estimates. Huber, Total Change and Quantile FEV1 slope estimates were associated with all-cause hospitalisations (Huber incidence rate ratio 0.98, 95% CI 0.97 to 0.99, p<0.001). Huber estimates were also associated with smoking status, comorbidities and prior hospitalisations. Similar results were identified in an external validation cohort.
Using EHR data to estimate the rate of FEV1 change is clinically applicable but sensitive to challenges intrinsic to clinically obtained data. While no analytic method will fully overcome these complexities, we identified Huber regression as useful in defining an individual's lung function change using EHR data.