Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China.
National Institute of Health Data Science of China, Jinan, Shandong, China.
J Am Heart Assoc. 2024 Jan 2;13(1):e029400. doi: 10.1161/JAHA.123.029400. Epub 2023 Dec 29.
BACKGROUND: Traditional risk evaluation models have been applied in various studies to guide public health and clinical practice. However, applying existing methods to data sets with missing and censored data, as is often the case in electronic health records, requires additional considerations. We aimed to develop and validate a predictive model that performs well on data sets containing missing and censored data.
METHODS AND RESULTS: This retrospective cohort study of coronary heart disease included unique patients aged 18 to 96 years seen at Weihai Municipal Hospital between 2013 and 2021. The study population comprised 169 692 participants, of whom 10 895 were diagnosed with coronary heart disease. Models for the risk of coronary heart disease were built from demographic, laboratory, and medical history variables. All complete samples were assigned to the training set (n=110 325), and the remaining samples were assigned to the validation set (n=59 367). In the derivation cohort, the area under the receiver operating characteristic curve was 0.800 (95% CI, 0.794-0.805) and the C statistic was 0.796 (95% CI, 0.791-0.801); the corresponding values in the validation cohort were 0.837 (95% CI, 0.821-0.853) and 0.838 (95% CI, 0.822-0.854). The calibration curve showed good calibration, and decision curve analysis indicated clinical usefulness.
CONCLUSIONS: The proposed risk prediction model was effective in handling the complexities of electronic health record data, which often involve extensive missing data and censoring. This approach may help in using electronic health records to improve patient outcomes.
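
The abstract describes two mechanical steps that a short example may make concrete: assigning complete records to the training set and the remaining records to the validation set, and scoring discrimination with the area under the receiver operating characteristic curve. The following is a minimal sketch under stated assumptions, not the authors' implementation. The record layout, the field names (`age`, `ldl`) and the helper names `split_by_completeness` and `auc` are hypothetical, and the AUC here treats the outcome as binary; a C statistic for censored follow-up data would need a time-aware definition that this sketch does not cover.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical record layout: feature name -> value, with None marking a missing value.
Record = Dict[str, Optional[float]]


def split_by_completeness(
    records: Sequence[Record], features: Sequence[str]
) -> Tuple[List[Record], List[Record]]:
    """Put records with every feature present into the training set and
    records with at least one missing feature into the validation set."""
    train: List[Record] = []
    valid: List[Record] = []
    for rec in records:
        is_complete = all(rec.get(f) is not None for f in features)
        (train if is_complete else valid).append(rec)
    return train, valid


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC for a binary outcome: the fraction of (case, non-case)
    pairs in which the case has the higher score, counting ties as half.
    Quadratic in sample size, so only suitable for small illustrations."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    toy = [
        {"age": 50.0, "ldl": 3.1},
        {"age": 60.0, "ldl": None},
        {"age": 70.0, "ldl": 4.0},
    ]
    train, valid = split_by_completeness(toy, ["age", "ldl"])
    print(len(train), len(valid))
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Splitting on completeness means the validation set differs systematically from the training set, so any model evaluated this way has to produce predictions for records with missing features; how that is done is specific to the paper and is not shown here.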