Wong Carrie R, Flores Yvonne N, Avila Analissa, Tieu Lina, Crespi Catherine M, May Folasade P, Bell Douglas, Glenn Beth, Bastani Roshan
Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, University of California, Los Angeles.
UCLA Center for Cancer Prevention and Control and UCLA-Kaiser Permanente Center for Health Equity.
Res Sq. 2024 Oct 18:rs.3.rs-4993106. doi: 10.21203/rs.3.rs-4993106/v1.
We assessed how well ICD codes identify patients with hepatocellular carcinoma (HCC) in a large academic health system, and tested whether an algorithm combining ICD codes could identify HCC cases in electronic health record (EHR) data with higher accuracy and precision than a single ICD code.
In our cohort of 1,007 established ambulatory care patients with potential HCC, relying on a single ICD code entry for HCC (ICD-9-CM 155.0 or ICD-10-CM C22.0) yielded 58% false positives (patients who were not true HCC cases) on chart review. We developed an ICD code-based algorithm that prioritized positive predictive value (PPV), F-score, and accuracy to minimize false positives and false negatives. The highest-performing algorithm required at least 10 ICD code entries for HCC and a total of HCC code entries exceeding the total of code entries for non-HCC malignancies. The algorithm demonstrated high performance (PPV 97.4%, F-score 0.92, accuracy 94%) and was internally validated (PPV 92.3%, F-score 0.90, accuracy 91%) in a separate sample of potential HCC cases. Our findings support the need to assess the accuracy and precision of ICD codes before using EHR data to study HCC more broadly.
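The two-part rule described above (at least 10 HCC code entries, and more HCC entries than non-HCC malignancy entries) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `other_malignancy_codes` argument, and the example code "C18.9" are hypothetical, since the abstract does not list which codes counted as non-HCC malignancies. The sketch also assumes the reported F-score is the F1 score (the harmonic mean of PPV and sensitivity), which the abstract does not state.

```python
# HCC codes named in the abstract: ICD-9-CM 155.0 and ICD-10-CM C22.0.
HCC_CODES = {"155.0", "C22.0"}


def classify_hcc(code_entries, other_malignancy_codes, min_hcc_entries=10):
    """Flag a patient as a probable HCC case from a list of ICD code entries.

    code_entries: every ICD code entry recorded for one patient (repeats kept).
    other_malignancy_codes: hypothetical set of non-HCC malignancy codes;
        the abstract does not enumerate them, so the caller must supply it.
    """
    hcc_count = sum(1 for code in code_entries if code in HCC_CODES)
    other_count = sum(1 for code in code_entries if code in other_malignancy_codes)
    # Rule 1: at least `min_hcc_entries` HCC entries.
    # Rule 2: HCC entries must outnumber non-HCC malignancy entries.
    return hcc_count >= min_hcc_entries and hcc_count > other_count


def evaluate(predicted, actual):
    """Compare algorithm flags with chart-review labels (both lists of bool)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: "F-score" is taken to mean F1.
    denom = ppv + sensitivity
    f_score = 2 * ppv * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"ppv": ppv, "f_score": f_score, "accuracy": accuracy}
```

For example, a patient with ten "C22.0" entries and three entries for another malignancy would be flagged, whereas a patient with nine HCC entries, or with equal numbers of HCC and other-malignancy entries, would not. Requiring repeated entries is what separates this rule from the single-code approach that produced 58% false positives in the cohort.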