Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT, United States.
Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, United States.
J Stroke Cerebrovasc Dis. 2020 Dec;29(12):105306. doi: 10.1016/j.jstrokecerebrovasdis.2020.105306. Epub 2020 Oct 15.
Nontraumatic intracranial hemorrhage (ICH) is a neurological emergency of research interest; however, unlike ischemic stroke, it has not been well studied in large datasets because it lacks an established administrative claims-based definition. We aimed to evaluate both explicit diagnosis codes and machine learning methods to create a claims-based definition for this clinical phenotype.
We examined all patients admitted to our tertiary medical center in 2014-2015 with a primary or secondary International Classification of Diseases, Ninth Revision (ICD-9) or Tenth Revision (ICD-10) code for ICH in claims from any portion of the hospitalization. As the gold standard, we defined the nontraumatic ICH phenotype based on manual chart review. We tested explicit ICD-9 and ICD-10 definitions previously published in the literature, as well as four machine learning classifiers: support vector machine (SVM), logistic regression with LASSO, random forest, and XGBoost. We report five standard measures of model performance for each approach.
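An explicit code-based definition of the kind described above amounts to a prefix match over the diagnosis codes on a claim. The following is a minimal sketch under stated assumptions: the prefix list (ICD-10 categories I60-I62, which cover nontraumatic intracranial hemorrhage) and the function names are illustrative and are not the code list evaluated in the study, which is not reproduced in this abstract.

```python
from typing import Iterable, Tuple

# ASSUMPTION: illustrative prefixes only. ICD-10 categories I60-I62 describe
# nontraumatic intracranial hemorrhage, but the study's actual explicit
# definition is not given in this abstract.
ASSUMED_ICH_PREFIXES: Tuple[str, ...] = ("I60", "I61", "I62")


def normalize(code: str) -> str:
    """Upper-case a diagnosis code and drop whitespace and the decimal point."""
    return code.strip().upper().replace(".", "")


def meets_explicit_definition(
    claim_codes: Iterable[str],
    prefixes: Tuple[str, ...] = ASSUMED_ICH_PREFIXES,
) -> bool:
    """Return True if any primary or secondary code starts with a listed prefix."""
    return any(normalize(code).startswith(prefixes) for code in claim_codes)


# Hypothetical claims, each a list of primary and secondary diagnosis codes.
print(meets_explicit_definition(["i61.9", "I10"]))       # hemorrhage code present
print(meets_explicit_definition(["S06.5X0A", "I63.9"]))  # traumatic / ischemic only
```

A rule like this needs no training data, which is one reason an explicit definition is simple to port between claims datasets.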
A total of 1830 patients with 2145 unique ICD-10 codes were included in the initial dataset, of whom 437 (24%) were true positives based on manual review. The explicit ICD-10 definition performed best (Sensitivity = 0.89 (95% CI 0.85-0.92), Specificity = 0.83 (0.81-0.85), F-score = 0.73 (0.69-0.77)) and improved on an explicit ICD-9 definition (Sensitivity = 0.87 (0.83-0.90), Specificity = 0.77 (0.74-0.79), F-score = 0.67 (0.63-0.71)). Among machine learning classifiers, SVM performed best (Sensitivity = 0.78 (0.75-0.82), Specificity = 0.84 (0.81-0.87), AUC = 0.89 (0.87-0.92), F-score = 0.66 (0.62-0.69)).
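The reported measures follow from a 2x2 confusion matrix against the chart-review gold standard. The sketch below assumes that "F-score" means the F1 score (the harmonic mean of precision and sensitivity). The cell counts are not reported in the abstract; they are illustrative values back-calculated to be roughly consistent with 437 true cases among 1830 patients and the stated ICD-10 sensitivity and specificity.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, precision, and F1 from 2x2 counts."""
    sensitivity = tp / (tp + fn)   # true cases correctly flagged
    specificity = tn / (tn + fp)   # non-cases correctly excluded
    precision = tp / (tp + fp)     # flagged patients who are true cases
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# ASSUMPTION: hypothetical counts, not taken from the paper.
# tp + fn = 437 chart-confirmed cases; fp + tn = 1393 non-cases; total 1830.
metrics = classification_metrics(tp=389, fp=237, fn=48, tn=1156)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With only 24% of coded patients being true cases, precision is noticeably lower than sensitivity, which is why the F-score sits below both sensitivity and specificity.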
An explicit ICD-10 definition can be used to accurately identify patients with a nontraumatic ICH phenotype, with substantially better performance than ICD-9. An explicit ICD-10-based definition is easier to implement, and adding machine learning classifiers did not appreciably improve its quantitative performance. Future research using large datasets should apply this definition to address important research gaps.