Evaluation of Algorithms Using Automated Health Plan Data to Identify Breast Cancer Recurrences.

Affiliations

Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington, Seattle, Washington.

Division of Research, Kaiser Permanente Northern California, Oakland, California.

Publication information

Cancer Epidemiol Biomarkers Prev. 2024 Mar 1;33(3):355-364. doi: 10.1158/1055-9965.EPI-23-0782.

PMID:38088912

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10922110/

Abstract

BACKGROUND

We updated algorithms to identify breast cancer recurrences from administrative data, extending previously developed methods.

METHODS

In this validation study, we evaluated pairs of breast cancer recurrence algorithms (vs. individual algorithms) to identify recurrences. We generated algorithm combinations that categorized discordant algorithm results as no recurrence [High Specificity and PPV (positive predictive value) Combination] or recurrence (High Sensitivity Combination). We compared individual and combined algorithm results to manually abstracted recurrence outcomes from a sample of 600 people with incident stage I-IIIA breast cancer diagnosed between 2004 and 2015. We used Cox regression to evaluate risk factors associated with age- and stage-adjusted recurrence rates using different recurrence definitions, weighted by inverse sampling probabilities.
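The two combination rules described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `combine_pair`, the rule labels, and the boolean encoding of each algorithm's output are assumptions made for the example. When the two algorithms agree, both rules return the shared result; they differ only in how a discordant pair is classified.

```python
from typing import Literal

# Hypothetical rule labels mirroring the two combinations named in the abstract.
Rule = Literal["high_specificity_ppv", "high_sensitivity"]


def combine_pair(alg_a: bool, alg_b: bool, rule: Rule) -> bool:
    """Combine two algorithm results (True = recurrence flagged).

    high_specificity_ppv: a discordant pair counts as no recurrence,
        so both algorithms must flag a recurrence (logical AND).
    high_sensitivity: a discordant pair counts as a recurrence,
        so a flag from either algorithm is enough (logical OR).
    """
    if rule == "high_specificity_ppv":
        return alg_a and alg_b
    if rule == "high_sensitivity":
        return alg_a or alg_b
    raise ValueError(f"unknown rule: {rule!r}")
```

For a discordant pair such as `combine_pair(True, False, ...)`, the first rule yields no recurrence and the second yields a recurrence, which is why the second rule flags more cases.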

RESULTS

Among 600 people, we identified 117 recurrences using the High Specificity and PPV Combination, 505 using the High Sensitivity Combination, and 118 using manual abstraction. The High Specificity and PPV Combination had good specificity [98%, 95% confidence interval (CI): 97-99] and PPV (72%, 95% CI: 63-80) but modest sensitivity (64%, 95% CI: 44-80). The High Sensitivity Combination had good sensitivity (80%, 95% CI: 49-94) and specificity (83%, 95% CI: 80-86) but low PPV (29%, 95% CI: 25-34). Recurrence rates using combined algorithms were similar in magnitude for most risk factors.
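For readers who want to reproduce this kind of comparison on their own data, the sketch below computes sensitivity, specificity, and PPV against a reference standard such as manual abstraction. It assumes each record carries a sampling weight (for example, an inverse sampling probability); the abstract states this weighting for the Cox models, and applying it to the accuracy measures is an assumption here. The function name `weighted_metrics` and the record layout are hypothetical, and setting every weight to 1 gives the unweighted measures.

```python
from typing import Dict, Iterable, Tuple

# One record: (algorithm flagged recurrence, reference standard, sampling weight)
Record = Tuple[bool, bool, float]


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def weighted_metrics(records: Iterable[Record]) -> Dict[str, float]:
    """Weighted sensitivity, specificity, and PPV versus a reference standard."""
    tp = fp = fn = tn = 0.0
    for predicted, truth, weight in records:
        if predicted and truth:
            tp += weight  # true positive
        elif predicted and not truth:
            fp += weight  # false positive
        elif truth:
            fn += weight  # false negative
        else:
            tn += weight  # true negative
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


# Made-up toy records for illustration only; these are not study data.
toy_records = [
    (True, True, 1.0),
    (True, False, 2.0),
    (False, True, 1.0),
    (False, False, 6.0),
]
toy_result = weighted_metrics(toy_records)
```

Confidence intervals are omitted; with weighted data they would need a method that accounts for the sampling design.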

CONCLUSIONS

By combining algorithms, we identified breast cancer recurrences with greater PPV than individual algorithms, without additional review of discordant records.

IMPACT

Researchers should consider tradeoffs between accuracy and manual chart abstraction resources when using previously developed algorithms. We provided guidance for future studies that use breast cancer recurrence algorithms with or without supplemental manual chart abstraction.
