Electronic medical record-based deep data cleaning and phenotyping improve the diagnostic validity and mortality assessment of infective endocarditis: medical big data initiative of CMUH.
Author information
Chiang Hsiu-Yin, Liang Li-Ying, Lin Che-Chen, Chen Yi-Jin, Wu Min-Yen, Chen Sheng-Hsuan, Wu Pin-Hua, Kuo Chin-Chi, Chi Chih-Yu
Affiliations
Big Data Center, China Medical University Hospital, Taichung, Taiwan.
Division of Infectious Diseases, Department of Internal Medicine, China Medical University Hospital, Taichung, Taiwan.
Publication information
Biomedicine (Taipei). 2021 Sep 1;11(3):59-67. doi: 10.37796/2211-8039.1267. eCollection 2021.
BACKGROUND
International Classification of Diseases (ICD) code-based claims databases are often used to study infective endocarditis (IE). However, the quality of ICD coding can influence the reliability of IE research. The impact of complementing the ICD-only approach with data extracted from electronic medical records (EMRs) has yet to be explored.
METHODS
We selected the records of adult patients with discharge ICD codes for IE (ICD-9: 421, 112.81, 036.42, 098.84, 115.04, 115.14, 115.94, 424.9; ICD-10: I33, I38, I39) during 2005-2016 at China Medical University Hospital. Data extraction was conducted on the basis of the modified Duke criteria to establish a reference group comprising patients with definite or possible IE. Clinical characteristics and in-hospital mortality were compared between ICD-identified and Duke-confirmed cases. The positive predictive value (PPV) was used to quantify the IE identification performance of various phenotyping algorithms.
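The ICD-based selection step described above can be sketched as follows. This is a minimal illustration, not the authors' extraction code: the function name `is_ie_code` is hypothetical, and the choice to match subcodes by prefix (for example, treating 421.0 as part of 421) is an assumption, since the abstract lists the codes but does not say how subcodes were handled.

```python
# ICD codes for IE as listed in the abstract.
ICD9_IE_CODES = ("421", "112.81", "036.42", "098.84",
                 "115.04", "115.14", "115.94", "424.9")
ICD10_IE_PREFIXES = ("I33", "I38", "I39")


def is_ie_code(code: str, version: int) -> bool:
    """Return True if a discharge code falls under the listed IE codes.

    Assumption (not stated in the source): a listed code also covers
    its more specific subcodes, e.g. 421 covers 421.0 and 424.9
    covers 424.90.
    """
    code = code.strip().upper()
    if version == 9:
        for listed in ICD9_IE_CODES:
            if "." in listed:
                # Listed code already has decimals: allow extra digits.
                if code.startswith(listed):
                    return True
            elif code == listed or code.startswith(listed + "."):
                # Three-digit category: match itself or any subcode.
                return True
        return False
    if version == 10:
        return code.startswith(ICD10_IE_PREFIXES)
    raise ValueError("version must be 9 or 10")
```

A discharge record would then be flagged as ICD-identified IE when any of its coded diagnoses passes this check; Duke-criteria review is a separate step applied to the flagged records.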
RESULTS
A total of 593 patients with discharge ICD codes for IE were identified, of whom only 56.7% met the modified Duke criteria. The crude in-hospital mortality rates for Duke-confirmed and Duke-rejected IE were 24.4% and 8.2%, respectively. The adjusted in-hospital mortality for ICD-identified IE was 5.1% lower than that for Duke-confirmed IE. The best PPV (0.90, 95% CI 0.86-0.93) was achieved when major components of the Duke criteria (positive blood culture and vegetation) were integrated with ICD codes.
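To make the PPV figures concrete, the sketch below computes a PPV and a 95% confidence interval for the ICD-only approach. Two points are assumptions rather than facts from the abstract: the count of 336 Duke-confirmed cases is back-calculated from 56.7% of 593, and the Wilson score interval is used for illustration because the abstract does not say which interval method the authors applied.

```python
import math


def ppv(true_positives: int, flagged: int) -> float:
    """Positive predictive value: confirmed cases / all flagged cases."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    return true_positives / flagged


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 gives an approximate 95% interval. The interval method
    is an assumption; the source does not name the one it used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total
                         + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# 593 ICD-identified patients (from the abstract).
# 336 is an assumed count, back-calculated from 56.7% of 593.
icd_identified = 593
duke_confirmed = 336

estimate = ppv(duke_confirmed, icd_identified)
low, high = wilson_interval(duke_confirmed, icd_identified)
print(f"PPV = {estimate:.3f}, 95% CI {low:.3f}-{high:.3f}")
```

The same two functions apply to any phenotyping algorithm: the denominator is the number of patients the algorithm flags, and the numerator is the number of those who meet the modified Duke criteria.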
CONCLUSION
Integrating EMR data can considerably improve the accuracy of ICD-only approaches in phenotyping IE, which can improve the validity of EMR-based studies and their applications, including real-time surveillance and clinical decision support.