Mori Yuichiro, Fukuma Shingo, Yamaji Kyohei, Mizuno Atsushi, Kondo Naoki, Inoue Kosuke
Department of Human Health Sciences, Graduate School of Medicine, Kyoto University, Kyoto, Japan.
Department of Cardiovascular Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.
ESC Heart Fail. 2025 Apr;12(2):859-868. doi: 10.1002/ehf2.15056. Epub 2024 Nov 19.
Natriuretic peptide-based screening for pre-heart failure has been proposed in recent guidelines. However, no effective strategy has been established for identifying screening targets in the general population, more than half of which is at risk for heart failure or pre-heart failure. This study evaluated the performance of machine learning models for predicting elevated N-terminal pro-brain natriuretic peptide (NT-proBNP) levels in the US general population.
Individuals aged 20-79 years without cardiovascular disease from the nationally representative National Health and Nutrition Examination Survey 1999-2004 were included. Six prediction models (two conventional regression models and four machine learning models) were trained with the 1999-2002 cohort to predict elevated NT-proBNP levels (>125 pg/mL) using demographic, lifestyle, and commonly measured biochemical data. Model performance was tested using the 2003-2004 cohort. Of the 10 237 individuals, 1510 (14.8%) had NT-proBNP levels >125 pg/mL. The highest area under the receiver operating characteristic curve (AUC) was observed in SuperLearner (AUC [95% CI] = 0.862 [0.847-0.878], P < 0.001 compared with the logistic regression model). The logistic regression model with splines showed comparable performance (AUC [95% CI] = 0.857 [0.841-0.874], P = 0.08). Age, albumin level, haemoglobin level, sex, estimated glomerular filtration rate, and systolic blood pressure were the most important predictors. Prediction performance was similar even after excluding socio-economic information (marital status, family income, and education status) from the prediction models. When different thresholds for elevated NT-proBNP were used, the AUC (95% CI) of the SuperLearner models was 0.846 (0.830-0.861) for NT-proBNP > 100 pg/mL and 0.866 (0.849-0.884) for NT-proBNP > 150 pg/mL.
Using nationally representative data from the United States, both logistic regression and machine learning models predicted elevated NT-proBNP well. The predictive performance remained consistent even when the models incorporated only variables commonly available in daily clinical practice. Prediction models using regularly measured information could serve as potentially useful tools for clinicians to effectively identify targets of natriuretic peptide screening.
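The evaluation described above (labelling NT-proBNP as elevated at a threshold, then reporting an AUC with a 95% CI on a held-out cohort) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The data are synthetic, age alone stands in for a fitted model's predicted risk, and the percentile bootstrap is an assumed way to obtain the interval, since the abstract does not state how the CIs were computed. All function names are hypothetical.

```python
import math
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney rank formulation, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to exist
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def make_synthetic_test_cohort(n=400, seed=42):
    """Synthetic (age, NT-proBNP) pairs; the age relationship is invented."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        age = rng.uniform(20, 79)
        nt_probnp = math.exp(rng.gauss(2.5 + 0.03 * age, 0.8))  # pg/mL
        cohort.append((age, nt_probnp))
    return cohort


cohort = make_synthetic_test_cohort()
risk_scores = [age for age, _ in cohort]  # stand-in for model predictions

# Same three cut-offs as in the abstract: 100, 125, and 150 pg/mL.
for threshold in (100, 125, 150):
    labels = [1 if value > threshold else 0 for _, value in cohort]
    point = auc(labels, risk_scores)
    lower, upper = bootstrap_auc_ci(labels, risk_scores)
    print(f"> {threshold} pg/mL: AUC {point:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

In a real analysis the scores would come from a model fitted on the earlier survey cycles and applied to the later one, and survey weights might also need to be taken into account. Neither is modelled here.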