Investigating perioperative pressure injuries and factors influencing them with imbalanced samples using a Synthetic Minority Over-sampling Technique.

Authors

Zhou Yiwei, Wu Jian, Xu Xin, Shi Guirong, Liu Ping, Jiang Liping

Affiliations

Business School, University of Shanghai for Science and Technology, Shanghai, China.

School of Intelligent Emergency Management, University of Shanghai for Science and Technology, Shanghai, China.

Publication information

Biosci Trends. 2025 May 9;19(2):173-188. doi: 10.5582/bst.2025.01013. Epub 2025 Apr 15.

DOI:10.5582/bst.2025.01013

PMID:40240165

Abstract

This study investigates the use of machine learning (ML) models combined with the Synthetic Minority Over-sampling Technique (SMOTE) and its variants to predict perioperative pressure injuries (PIs) in an imbalanced dataset. PIs are a significant healthcare problem, often leading to prolonged hospitalization and increased medical costs. Conventional risk assessment scales are limited in their ability to predict PIs accurately, prompting the exploration of ML techniques to address this challenge.

We used data from 7,292 patients admitted to a tertiary care hospital in Shanghai between May 2017 and July 2023; the final dataset comprised 2,972 patients, including 158 with PIs. Seven ML algorithms were used: Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Extra Trees (ET), K-Nearest Neighbors (KNN), and Decision Trees (DT). Each was combined with SMOTE, SMOTE+ENN, Borderline-SMOTE, ADASYN, and a generative adversarial network (GAN) to balance the dataset and improve model performance.

Results revealed significant improvements in model performance when SMOTE and its variants were used. For instance, the XGBoost model had an AUC of 0.996 with SMOTE, compared to 0.800 on raw data. SMOTE+ENN and Borderline-SMOTE further enhanced the models' ability to identify the minority class. External validation indicated that XGBoost, RF, and ET exhibited the highest stability and accuracy, with XGBoost having an AUC of 0.977. SHAP analysis revealed that factors such as anesthesia grade, age, and serum albumin level significantly influenced model predictions.

In conclusion, integrating SMOTE with ML algorithms effectively addressed data imbalance and improved the prediction of perioperative PIs. Future work should focus on refining SMOTE techniques and exploring their application to larger, multi-center datasets to enhance the generalizability of these findings, especially for diseases with a low incidence.
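The oversampling step the abstract relies on can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. This is not the authors' pipeline, which the abstract does not describe at code level. The function name `smote`, its parameters, and the toy two-feature data below are all assumptions made for illustration, and the variants named in the abstract (SMOTE+ENN, Borderline-SMOTE, ADASYN, GAN) are not covered.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE: interpolate between minority samples and
    their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # from the base sample to the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature data (made-up values, not taken from the study).
majority = [(float(x), float(y)) for x in range(6) for y in range(5)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 12.0), (12.0, 11.0)]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In this sketch the two classes end up the same size, and every synthetic point lies between existing minority samples rather than duplicating them. In practice, oversampling is usually applied to the training split only, so that validation data contain no synthetic samples; whether the study did so is not stated in the abstract.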
