Development and validation of ischemic heart disease and stroke prognostic models using large-scale real-world data from Japan.
Affiliations
Data Science and Advanced Analytics, IQVIA Solutions Japan K.K.
Real-World Evidence Solutions, IQVIA Solutions Japan K.K.
Publication information
Environ Health Prev Med. 2023;28:16. doi: 10.1265/ehpm.22-00106.
BACKGROUND
Previous cardiovascular risk prediction models in Japan have relied on prospective cohort studies with concise data. As health information, including health check-up records and administrative claims, becomes digitalized and publicly available, large datasets built from such real-world data can achieve good prediction accuracy and support the social implementation of cardiovascular disease risk prediction models in preventive and clinical practice. In this study, classical regression and machine learning methods were explored to develop ischemic heart disease (IHD) and stroke prognostic models using real-world data.
METHODS
The IQVIA Japan Claims Database was searched to include 691,160 individuals (predominantly employees of companies in secondary and tertiary industries, and their family members) with at least one annual health check-up record during the identification period (April 2013 to December 2018). The primary outcome was the first recorded IHD or stroke event. Predictors were taken from the annual health check-up record at the index year-month and comprised demographic characteristics, laboratory tests, and questionnaire features. Four prediction models (Cox, Elnet-Cox, XGBoost, and Ensemble) were assessed to develop a cardiovascular disease risk prediction model for Japan.
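For readers unfamiliar with the Elnet-Cox model, the block below sketches one common formulation of an elastic-net penalized Cox regression. The abstract does not state the parameterization the authors used, so the symbols here are assumptions for illustration: $x_i$ is the predictor vector of individual $i$, $\delta_i$ the event indicator, $R(t_i)$ the set of individuals still at risk at time $t_i$, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mix between the lasso and ridge penalties. Scaling conventions for the likelihood term vary between software packages.

```latex
\hat{\beta} = \arg\max_{\beta}
  \left[
    \frac{1}{n} \sum_{i:\, \delta_i = 1}
      \left( x_i^{\top}\beta
        - \log \sum_{j \in R(t_i)} \exp\!\left( x_j^{\top}\beta \right)
      \right)
    - \lambda \left( \alpha \lVert \beta \rVert_1
        + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right)
  \right]
```

With $\lambda = 0$ this reduces to the ordinary Cox partial likelihood, which corresponds to the unpenalized Cox model in the comparison.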
RESULTS
The analysis cohort consisted of 572,971 individuals. All prediction models showed similarly good performance. Harrell's C-index was close to 0.9 for all IHD models and above 0.7 for all stroke models. In IHD models, age, sex, high-density lipoprotein, low-density lipoprotein, cholesterol, and systolic blood pressure had higher importance, while in stroke models systolic blood pressure and age had higher importance.
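As a rough illustration of the metric reported above, the following sketch computes Harrell's C-index directly from its definition: among comparable pairs, the fraction in which the individual with the earlier observed event also has the higher predicted risk. It is not the authors' code. The function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are skipped for simplicity, which production implementations handle more carefully.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Return Harrell's C for right-censored data (simplified sketch).

    times       -- observed follow-up time per individual
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # 'first' is the individual with the shorter follow-up time
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four individuals, the third one is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of about 0.9 (IHD) and above 0.7 (stroke) should be read.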
CONCLUSION
Our study analyzed classical regression and machine learning algorithms to develop cardiovascular disease risk prediction models for IHD and stroke in Japan. The resulting models can be applied in practice to a large population with good predictive accuracy.