Prediction of hypercholesterolemia using machine learning techniques.

Authors

Moradifar Pooyan, Amiri Mohammad Meskarpour

Affiliations

Independent researcher, Tehran, Iran.

Health Management Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran.

Publication information

J Diabetes Metab Disord. 2022 Dec 22;22(1):255-265. doi: 10.1007/s40200-022-01125-w. eCollection 2023 Jun.

Abstract

PURPOSE

Hypercholesterolemia is a major risk factor for a wide range of cardiovascular diseases. Developing countries are more susceptible to hypercholesterolemia and its complications because of its increasing prevalence and the lack of adequate resources for screening and/or prevention programs. Using machine learning techniques to identify the factors that contribute to hypercholesterolemia and to develop predictive models can support its early detection, especially in developing countries.

METHODS

Data from the nationwide 2016 STEPs study in Iran were used to identify socioeconomic, lifestyle, and metabolic risk factors associated with hypercholesterolemia. The predictive power of the identified risk factors was then assessed with five commonly used machine learning algorithms (random forest, gradient boosting, support vector machine, logistic regression, and artificial neural network) under 10-fold cross-validation, measured by specificity, sensitivity, and the area under the receiver operating characteristic curve (AUC).
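The three evaluation measures named above can be sketched in plain Python. This is only an illustration of the standard definitions, not the authors' code: the function names (`sensitivity_specificity`, `roc_auc`, `k_fold_indices`) are hypothetical, labels are assumed to be coded 0/1, and the fold split is a simple unshuffled, unstratified one, which may differ from what the study used.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(
    n_samples: int, k: int = 10
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; every sample is tested once.
    Assumption: no shuffling or stratification, for simplicity."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, folds[i]))
    return splits
```

In a cross-validated evaluation, a model would be fitted on each `train` index set and scored on the matching `test` set, with the three measures averaged over the ten folds.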

RESULTS

A total of 14,667 individuals were included in this study, of whom 12.8% (n = 1879) had (undiagnosed) hypercholesterolemia. Based on multivariate logistic regression analysis, the five most important risk factors for hypercholesterolemia were: older age (for the elderly group: OR = 2.243; for the middle-aged group: OR = 1.869), obesity-related factors including high BMI status (morbidly obese: OR = 1.884; obese: OR = 1.499; overweight: OR = 1.426) and AO (OR = 1.339), raised BP (hypertension: OR = 1.729; prehypertension: OR = 1.577), consuming fish once or twice per week (OR = 1.261), and having a risky diet (OR = 1.163). Furthermore, all five hypercholesterolemia prediction models achieved an AUC of around 0.62, and the models based on random forest (AUC = 0.6282; specificity = 65.14%; sensitivity = 60.51%) and gradient boosting (AUC = 0.6263; specificity = 64.11%; sensitivity = 61.15%) performed best.
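As a reading aid for the odds ratios above, the following sketch shows how an OR from a logistic regression relates to the model coefficient and to a probability. The helper names are hypothetical, and the baseline probability used in the example is an assumption chosen purely for illustration (the abstract does not report the risk in any reference group).

```python
import math


def coefficient_from_or(odds_ratio: float) -> float:
    """A logistic regression coefficient is the natural log of the OR."""
    return math.log(odds_ratio)


def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Scale the odds of a baseline probability by an OR and convert
    the result back to a probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# OR reported for the elderly group in the abstract.
beta_elderly = coefficient_from_or(2.243)

# Assumption for illustration only: treat the overall prevalence (12.8%)
# as if it were the risk in the reference group.
illustrative_prob = apply_odds_ratio(0.128, 2.243)
```

Note that an OR multiplies the odds, not the probability, so an OR of 2.243 does not mean the risk itself is 2.243 times higher.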

CONCLUSION

The study shows that socioeconomic inequalities, unhealthy lifestyle, and metabolic syndrome (including obesity and hypertension) are significant predictors of hypercholesterolemia. Controlling these factors is therefore necessary to reduce the burden of hypercholesterolemia. Furthermore, machine learning algorithms such as random forest and gradient boosting can be employed for hypercholesterolemia screening and timely diagnosis. Applying deep learning algorithms, as well as techniques for handling the class overlap problem, seems necessary to improve the performance of the models.
