Research and performance analysis of random forest-based feature selection algorithm in sports effectiveness evaluation.

Authors

Li Yujiao, Mu Yingjie

Affiliation

Harbin Normal University, Harbin, 150025, China.

Publication information

Sci Rep. 2024 Nov 1;14(1):26275. doi: 10.1038/s41598-024-76706-1.

Abstract

The rapid progress in fields such as data mining and machine learning, together with the explosive growth of sports big data, has posed new challenges for sports big data research. Most available sports data mining techniques concentrate on extracting and constructing effective features from basic sports data, which cannot be achieved by simple data statistics alone. In the targeted mining of sports data in particular, traditional techniques still suffer from low classification accuracy and insufficient refinement. To address the low accuracy of traditional mining methods, this study combines the random forest algorithm with the artificial raindrop algorithm and adopts a feature selection-based sports data mining method to analyze sports big data effectively. Building on a random forest-based method for evaluating exercise effects, it uses feature extraction algorithms to study the impact of exercise. It ranks feature importance by the information gain index and thereby quantifies the degree to which exercise influences various indicators of the human body. In simulation experiments, the proposed algorithm achieved the best accuracy and F1 scores on both the training and testing sets, with accuracies of 0.849 ± 0.021 and 0.819 ± 0.022, respectively, and F1 scores of 0.837 ± 0.020 and 0.864 ± 0.021, respectively. These results indicate that the proposed algorithm has high classification accuracy and that the random forest-based feature selection algorithm established in this study outperforms existing traditional feature extraction methods in both performance and accuracy. The proposed data analysis method enables accurate and efficient use of sports big data, which is of great significance for the development of the sports education industry.
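
The information gain ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the ranking step for discrete features, leaves out the random forest and the artificial raindrop algorithm, and all feature names, values, and labels below are hypothetical toy data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature name, gain) pairs, most informative feature first."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: three discretized indicators for eight participants.
sessions = ["high"] * 4 + ["low"] * 4
resting_hr = ["low", "low", "low", "high", "high", "high", "high", "low"]
shoe_colour = ["red", "blue"] * 4
rows = [
    {"weekly_sessions": s, "resting_hr": h, "shoe_colour": c}
    for s, h, c in zip(sessions, resting_hr, shoe_colour)
]
labels = ["improved"] * 4 + ["unchanged"] * 4

for name, gain in rank_features(rows, labels):
    print(f"{name}: {gain:.3f}")
```

In this toy data, `weekly_sessions` separates the two outcome classes perfectly and ranks first, while `shoe_colour` carries no information about the outcome and ranks last. A feature selection step would keep the top-ranked features and discard the rest before training a classifier.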

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7be3/11530685/8348bf8e1f16/41598_2024_76706_Fig1_HTML.jpg
