Department of Medical Informatics, Erasmus MC University Medical Center, Rotterdam, The Netherlands.
BMJ Open. 2013 Jun 20;3(6):e002862. doi: 10.1136/bmjopen-2013-002862.
To evaluate the positive predictive value (PPV) of different disease codes and free text in identifying acute myocardial infarction (AMI) from electronic healthcare records (EHRs).
Validation study of cases of AMI identified from general practitioner records and hospital discharge diagnoses using free text and codes from the International Classification of Primary Care (ICPC), International Classification of Diseases 9th revision-clinical modification (ICD9-CM) and ICD-10th revision (ICD-10).
Population-based databases comprising routinely collected data from primary care in Italy and the Netherlands and from secondary care in Denmark from 1996 to 2009.
A total of 4 034 232 individuals with 22 428 883 person-years of follow-up contributed to the data, from which 42 774 potential AMI cases were identified. A random sample of 800 cases was subsequently obtained for validation.
PPVs were calculated overall and for each code/free text. 'Best-case scenario' and 'worst-case scenario' PPVs were calculated, the latter taking into account non-retrievable/non-assessable cases. We further assessed the effects of AMI misclassification on estimates of risk during drug exposure.
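As a minimal sketch of the calculation described above, the following Python computes a PPV with a normal-approximation (Wald) 95% CI, together with 'best-case' and 'worst-case' variants. Two points are assumptions rather than details stated in the abstract: that the CIs are Wald intervals, and that the 'worst-case scenario' counts non-retrievable/non-assessable cases as not confirmed. The function names and the counts in the usage lines are hypothetical and are not study data.

```python
import math


def ppv_with_ci(confirmed, total, z=1.96):
    """Return (ppv, lower, upper) using a Wald interval (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    ppv = confirmed / total
    half_width = z * math.sqrt(ppv * (1 - ppv) / total)
    # Clamp the interval to the valid probability range [0, 1].
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


def best_and_worst_case(confirmed, assessed, not_assessable):
    """Best case: denominator is assessed cases only.

    Worst case (assumption): non-retrievable/non-assessable cases are added
    to the denominator, i.e. treated as if they were not true AMI.
    """
    best = ppv_with_ci(confirmed, assessed)
    worst = ppv_with_ci(confirmed, assessed + not_assessable)
    return best, worst


# Hypothetical counts for one code, for illustration only.
best, worst = best_and_worst_case(confirmed=90, assessed=120, not_assessable=10)
print("best case:  PPV=%.1f%% (95%% CI %.1f%% to %.1f%%)" % tuple(100 * v for v in best))
print("worst case: PPV=%.1f%% (95%% CI %.1f%% to %.1f%%)" % tuple(100 * v for v in worst))
```

Under these assumptions the worst-case PPV can never exceed the best-case PPV, because the numerator is unchanged while the denominator grows.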
Records of 748 cases (93.5% of sample) were retrieved. ICD-10 codes had a 'best-case scenario' PPV of 100%, while ICD9-CM codes had a PPV of 96.6% (95% CI 93.2% to 99.9%). ICPC codes had a 'best-case scenario' PPV of 75% (95% CI 67.4% to 82.6%), and free text had PPVs ranging from 20% to 60%. The corresponding PPVs in the 'worst-case scenario' all decreased. Use of codes with lower PPV generally resulted in small changes in AMI risk during drug exposure, but codes with higher PPV resulted in attenuation of risk for positive associations.
ICD9-CM and ICD-10 codes have good PPV in identifying AMI from EHRs; strategies are needed to further optimise the utility of ICPC codes and free-text search. Use of specific AMI disease codes in estimating risk during drug exposure may lead to small but significant changes, at the expense of decreased precision.