Ye Chengyin, Fu Tianyun, Hao Shiying, Zhang Yan, Wang Oliver, Jin Bo, Xia Minjie, Liu Modi, Zhou Xin, Wu Qian, Guo Yanting, Zhu Chunqing, Li Yu-Ming, Culver Devore S, Alfreds Shaun T, Stearns Frank, Sylvester Karl G, Widen Eric, McElhinney Doff, Ling Xuefeng
Department of Health Management, Hangzhou Normal University, Hangzhou, China.
Department of Surgery, Stanford University, Stanford, CA, United States.
J Med Internet Res. 2018 Jan 30;20(1):e22. doi: 10.2196/jmir.9268.
Hypertension is a highly prevalent condition that is clinically costly, difficult to manage, and often leads to severe, life-threatening diseases such as cardiovascular disease (CVD) and stroke.
The aim of this study was to develop and prospectively validate a model that predicts the risk of incident essential hypertension within the following year.
Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was adopted in the process of feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual.
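To make the last step concrete, the sketch below shows how a boosted ensemble of trees can turn one patient's features into a single risk score: each tree contributes a leaf value, the values are summed on the log-odds scale, and a logistic function maps the sum to a score between 0 and 1. This is a minimal illustration of the general mechanism, not the study's model. The feature names, thresholds, leaf values, and starting log-odds are all hypothetical, and the trees are reduced to one-split stumps.

```python
import math

# Hypothetical one-split trees ("stumps"):
# (feature, threshold, leaf value if below threshold, leaf value otherwise).
# Feature names, thresholds, and leaf values are invented for illustration;
# they are NOT taken from the study.
STUMPS = [
    ("age", 50.0, -0.8, 0.6),
    ("has_type2_diabetes", 0.5, -0.2, 0.9),
    ("visits_last_year", 4.0, -0.1, 0.3),
]

# Hypothetical starting log-odds before any tree is applied.
BASE_MARGIN = -2.0


def risk_score(patient):
    """Sum the leaf values of all trees, then map the log-odds to (0, 1)."""
    margin = BASE_MARGIN
    for feature, threshold, below, at_or_above in STUMPS:
        margin += below if patient[feature] < threshold else at_or_above
    return 1.0 / (1.0 + math.exp(-margin))


# Two made-up patients.
younger = {"age": 35, "has_type2_diabetes": 0, "visits_last_year": 1}
older = {"age": 67, "has_type2_diabetes": 1, "visits_last_year": 6}

if __name__ == "__main__":
    print(f"younger patient: {risk_score(younger):.3f}")
    print(f"older patient:   {risk_score(older):.3f}")
```

In a real gradient boosting model the trees are learned from data and are usually deeper than one split, but the scoring step has the same shape: add up the per-tree outputs, then apply the logistic transform.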
The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories. In the lowest risk category (score 0-0.05), 4526 of 381,544 patients (1.19%) received a diagnosis of incident hypertension within the following year, compared with 21,050 of 41,329 patients (50.93%) in the highest risk category (score 0.4-1). Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly individuals (age >50 years) with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants: some community-level factors were associated with higher risk, whereas others were protective against hypertension.
Using statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, where it can inform interventions for hypertension and related diseases and may help improve hypertension care.
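The stratification figures can be checked with a few lines of arithmetic. The sketch below recomputes the incidence in the lowest and highest risk categories from the reported counts and assigns a score to a category. Only the two outer score ranges (0-0.05 and 0.4-1) are reported here, so the three middle categories are collapsed into one label, and the treatment of scores that fall exactly on a boundary is an assumption.

```python
# Counts reported for the two extreme risk categories:
# (patients with incident hypertension, patients in the category).
REPORTED = {
    "lowest": (4526, 381544),   # score 0-0.05
    "highest": (21050, 41329),  # score 0.4-1
}


def incidence_percent(cases, total):
    """Share of a category diagnosed within the following year, in percent."""
    return round(100.0 * cases / total, 2)


def risk_category(score):
    """Assign a score to a risk category.

    Assumption: lower bounds are inclusive and upper bounds exclusive.
    The cut points of the three middle categories are not reported,
    so they are collapsed into a single 'intermediate' label.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("risk score must lie in [0, 1]")
    if score < 0.05:
        return "lowest"
    if score >= 0.4:
        return "highest"
    return "intermediate"


rates = {name: incidence_percent(*counts) for name, counts in REPORTED.items()}

# How many times more frequent incident hypertension was in the highest
# category than in the lowest, computed from the raw counts.
high_cases, high_total = REPORTED["highest"]
low_cases, low_total = REPORTED["lowest"]
incidence_ratio = (high_cases / high_total) / (low_cases / low_total)

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate}%")
    print(f"highest vs lowest incidence ratio: {incidence_ratio:.1f}")
```

The recomputed percentages match the reported 1.19% and 50.93%, and the ratio of the two shows that incident hypertension was roughly 43 times more frequent in the highest risk category than in the lowest.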