Assessment of non-fatal injuries among university students in Hainan: a machine learning approach to exploring key factors.

Authors

Lu Kang, Cao Xiaodong, Wang Lixia, Huang Tao, Chen Lanfang, Wang Xiaodan, Li Qiao

Affiliation

School of Public Health, Hainan Medical University, Haikou, China.

Publication information

Front Public Health. 2024 Nov 21;12:1453650. doi: 10.3389/fpubh.2024.1453650. eCollection 2024.

Abstract

BACKGROUND

Injuries are a significant global public health concern, particularly among people aged 0-34. They are shaped by social, psychological, and physiological factors and are no longer viewed as purely accidental. Existing research has identified multiple risk factors for injury, but it often focuses on children or older adults and neglects university students. Machine learning (ML) offers advanced analytics and is better suited than traditional methods to complex, nonlinear data, yet it remains underutilized in injury research. To fill this gap, this study applies ML to injury data from university students in Hainan Province, with the aim of informing effective prevention strategies. To explore the relationship between scores on the self-rating anxiety scale and the self-rating depression scale and the risk of non-fatal injury within 1 year, we used restricted cubic splines to categorize these scores into two groups.

METHODS

Chi-square tests and LASSO regression were used to screen factors potentially associated with non-fatal injuries. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to balance the dataset. Random forest, logistic regression, decision tree, and XGBoost models were then fitted. Each model underwent 10-fold cross-validation to mitigate overfitting, and hyperparameters were tuned to improve performance. SHapley Additive exPlanations (SHAP) were used to identify the main factors influencing non-fatal injuries.
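As an illustration of the balancing step, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. It is not the authors' code. The function names `smote` and `balance`, the toy data, and the choice of Euclidean distance on numeric features are all assumptions made for illustration; the abstract does not say which SMOTE implementation or variant was used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the chosen one.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: 6 "not injured" (0) and 3 "injured" (1) samples.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 2], [3, 3], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = balance(X, y, k=2)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. For survey data dominated by categorical variables, a categorical-aware variant would normally be preferred over this plain numeric interpolation.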

RESULTS

The random forest model proved effective in this study. It identified three primary risk factors for non-fatal injuries: being male, a favorable household financial situation, and a stable relationship. Protective factors included reduced internet time and being an only child.
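The split into risk and protective factors rests on the sign of each feature's contribution to the model's prediction, which is what Shapley-value attributions report. The sketch below computes exact Shapley values by brute force over feature subsets for one instance. It illustrates the underlying idea only: it is not the SHAP library and not the authors' analysis. The helper `shapley_values`, the `toy_model` function, and its coefficients are hypothetical and are not estimates from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x.
    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical risk score; coefficients are invented for illustration."""
    male, only_child, high_internet = z
    return 0.10 + 0.08 * male - 0.03 * only_child + 0.05 * male * high_internet


x = [1, 1, 1]         # instance to explain
baseline = [0, 0, 0]  # reference instance
phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["male", "only_child", "high_internet"], phi):
    print(f"{name}: {contribution:+.3f}")
```

A positive value pushes the predicted risk above the baseline (a risk factor for this instance) and a negative value pushes it below (a protective factor). The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`. Brute-force enumeration is exponential in the number of features, which is why practical tools rely on model-specific or sampling-based approximations.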

CONCLUSION

The study highlighted five key factors influencing non-fatal injuries: sex, household financial situation, relationship stability, internet time, and sibling status. The random forest, logistic regression, decision tree, and XGBoost models varied in how effectively they identified these factors, with the random forest model performing best.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9582/11617571/dbf5f644e5a4/fpubh-12-1453650-g001.jpg
