Center for Research and Evaluation, Kaiser Permanente Georgia, Atlanta, Georgia, USA.
Institute for Health Research, Kaiser Permanente Colorado, Aurora, Colorado, USA.
J Am Med Inform Assoc. 2022 Jun 14;29(7):1217-1224. doi: 10.1093/jamia/ocac044.
Tumor registries in integrated healthcare systems (IHCS) have high precision for identifying incident cancer but often miss recently diagnosed cancers or those diagnosed outside of the IHCS. We developed an algorithm that uses the electronic medical record (EMR) to find people with a history of cancer not captured in the tumor registry, so that adults aged 40-65 years with no history of cancer could be reliably identified.
The algorithm was developed at Kaiser Permanente Colorado, and then applied to 7 other IHCS. We included tumor registry data, diagnosis and procedure codes, chemotherapy files, oncology encounters, and revenue data to develop the algorithm. Each IHCS adapted the algorithm to their EMR data and calculated sensitivity and specificity to evaluate the algorithm's performance after iterative chart review.
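As a minimal sketch of the performance calculation described above, the following assumes chart review is the reference standard and that each reviewed chart falls into one cell of a 2x2 table. The function name and the counts in the usage lines are hypothetical illustrations, not values from the study.

```python
def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from chart-review counts.

    tp: algorithm flagged cancer history, chart review confirmed it
    fp: algorithm flagged cancer history, chart review found none
    tn: algorithm flagged no history, chart review agreed
    fn: algorithm flagged no history, chart review found cancer
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reviewed chart with and without cancer")
    sensitivity = tp / (tp + fn)  # share of true cancer histories the algorithm caught
    specificity = tn / (tn + fp)  # share of cancer-free people correctly left unflagged
    return sensitivity, specificity


# Hypothetical chart-review counts, for illustration only.
sens, spec = sensitivity_specificity(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

In an iterative workflow, a site would presumably recompute these two figures after each round of chart review and algorithm adjustment.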
We included data from over 1.26 million eligible people across 8 IHCS; 55 601 (4.4%) were in a tumor registry, and 44 848 (3.5%) had a reported cancer not captured in a registry. The common attributes of the final algorithm at each site were diagnosis and procedure codes. The sensitivity of the algorithm at each IHCS was 90.65%-100%, and the specificity was 87.91%-100%.
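The abstract does not give the algorithm's rules, so the following is only an illustrative sketch of how registry membership might be combined with diagnosis and procedure codes to sort people into the three groups reported above. The record layout, the function name, and both code lists are assumptions: the diagnosis prefixes follow the general ICD-10-CM convention (C-chapter malignant neoplasms, Z85 personal history of malignant neoplasm), and the procedure codes are placeholders rather than real codes.

```python
from dataclasses import dataclass, field

# Assumed, simplified code lists; a real site would maintain curated lists.
CANCER_DX_PREFIXES = ("C", "Z85")           # ICD-10-CM style prefixes (assumption)
CANCER_PROCEDURE_CODES = {"PROC_CHEMO_ADMIN", "PROC_RADIATION_TX"}  # placeholders


@dataclass
class PersonRecord:
    person_id: str
    in_tumor_registry: bool = False
    diagnosis_codes: set[str] = field(default_factory=set)
    procedure_codes: set[str] = field(default_factory=set)


def classify_cancer_history(record: PersonRecord) -> str:
    """Return 'registry', 'emr_only', or 'no_history' for one person."""
    if record.in_tumor_registry:
        return "registry"
    has_dx = any(code.startswith(CANCER_DX_PREFIXES) for code in record.diagnosis_codes)
    has_proc = bool(record.procedure_codes & CANCER_PROCEDURE_CODES)
    # Cancer evidence in the EMR without a registry entry.
    if has_dx or has_proc:
        return "emr_only"
    return "no_history"


people = [
    PersonRecord("p1", in_tumor_registry=True),
    PersonRecord("p2", diagnosis_codes={"Z85.3"}),
    PersonRecord("p3", diagnosis_codes={"I10"}),
]
for person in people:
    print(person.person_id, classify_cancer_history(person))
```

Under this sketch, only people classified as `no_history` would remain in the cancer-free cohort; the other data sources named in the abstract (chemotherapy files, oncology encounters, revenue data) could be added as further evidence checks in the same way.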
Relying only on tumor registry data would miss nearly half of the identified cancers. Our algorithm was robust and required only minor modifications to adapt to other EMR systems.
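The "nearly half" figure can be checked from the two counts reported in the abstract, assuming the registry and non-registry groups do not overlap. The variable names are illustrative.

```python
# Counts reported in the abstract.
in_registry = 55_601       # people found in a tumor registry
emr_only = 44_848          # reported cancer not captured in a registry

# Assumes the two groups are mutually exclusive.
total_identified = in_registry + emr_only
share_missed_by_registry = emr_only / total_identified

print(f"{total_identified} identified; registry alone would miss "
      f"{share_missed_by_registry:.1%}")
```

On these counts, a registry-only approach would miss roughly 45% of the people identified as having a cancer history.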
This algorithm can identify cancer cases regardless of when the diagnosis occurred and may be useful for a variety of research applications or quality improvement projects around cancer care.