On Leveraging Machine Learning in Sport Science in the Hypothetico-deductive Framework.

Authors

Rodu Jordan, DeJong Lempke Alexandra F, Kupperman Natalie, Hertel Jay

Affiliations

Department of Statistics, University of Virginia, Charlottesville, VA, USA.

Department of Physical Medicine and Rehabilitation, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA.

Publication information

Sports Med Open. 2024 Nov 14;10(1):124. doi: 10.1186/s40798-024-00788-4.

Abstract

Supervised machine learning (ML) offers an exciting suite of algorithms that could benefit research in sport science. In principle, supervised ML approaches were designed for pure prediction, as opposed to explanation, leading to a rise in powerful, but opaque, algorithms. Recently, two subdomains of ML have grown in popularity: explainable ML, which allows us to "peek into the black box," and interpretable ML, which encourages using algorithms that are inherently interpretable. The increased transparency of these powerful ML algorithms may provide considerable support for the hypothetico-deductive framework, in which hypotheses are generated from prior beliefs and theory, and are assessed against data collected specifically to test that hypothesis. However, this paper shows why ML algorithms are fundamentally different from statistical methods, even when using explainable or interpretable approaches. Translating potential insights from supervised ML algorithms, while in many cases seemingly straightforward, can have unanticipated challenges. While supervised ML cannot be used to replace statistical methods, we propose ways in which the sport sciences community can take advantage of supervised ML in the hypothetico-deductive framework.

In this manuscript we argue that supervised machine learning can and should augment our exploratory investigations in sport science, but that leveraging potential insights from supervised ML algorithms should be undertaken with caution. We justify our position through a careful examination of supervised machine learning, and provide a useful analogy to help elucidate our findings. Three case studies are provided to demonstrate how supervised machine learning can be integrated into exploratory analysis.

Supervised machine learning should be integrated into the scientific workflow with requisite caution. The approaches described in this paper provide ways to safely leverage the strengths of machine learning (such as the flexibility ML algorithms can provide for fitting complex patterns) while avoiding the potential pitfalls (at best, wasted effort and money; at worst, misguided clinical recommendations) that may arise when trying to integrate findings from ML algorithms into domain knowledge.

KEY POINTS: Some supervised machine learning algorithms and statistical models are used to solve the same problem, y = f(x) + ε, but differ fundamentally in motivation and approach. The hypothetico-deductive framework, in which hypotheses are generated from prior beliefs and theory and are assessed against data collected specifically to test that hypothesis, is one of the core frameworks comprising the scientific method. In the hypothetico-deductive framework, supervised machine learning can be used in an exploratory capacity. However, it cannot replace the use of statistical methods, even as explainable and interpretable machine learning methods become increasingly popular. Improper use of supervised machine learning in the hypothetico-deductive framework is tantamount to p-value hacking in statistical methods.
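The comparison with p-value hacking in the last key point can be made concrete with a small simulation. The sketch below is not taken from the paper: the data are synthetic pure noise, the function names (`pearson`, `screen`, `permutation_p_value`) are hypothetical, and a simple correlation screen stands in for the variable-importance ranking a supervised ML algorithm might produce. It compares testing the selected variable on the same data used to select it against testing it on data set aside for that purpose.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def column(rows, j):
    """Extract feature j from a list of per-athlete rows."""
    return [row[j] for row in rows]


def screen(rows, y):
    """Exploratory step (stand-in for an ML importance ranking):
    return the index of the feature most correlated with y."""
    n_features = len(rows[0])
    scores = [abs(pearson(column(rows, j), y)) for j in range(n_features)]
    return max(range(n_features), key=scores.__getitem__)


def permutation_p_value(xs, ys, rng, n_perm=2000):
    """Two-sided permutation test of 'no association between xs and ys'."""
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Assumption: 200 hypothetical athletes, 50 features, outcome unrelated to all.
rng = random.Random(0)
n_athletes, n_features = 200, 50
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_athletes)]
y = [rng.gauss(0, 1) for _ in range(n_athletes)]

# Split once: one half for exploration, one half reserved for the test.
explore_X, explore_y = X[:100], y[:100]
confirm_X, confirm_y = X[100:], y[100:]

best = screen(explore_X, explore_y)

# Improper: test the hypothesis on the data that suggested it.
p_same = permutation_p_value(column(explore_X, best), explore_y, rng)
# Hypothetico-deductive: test it on data not used to generate it.
p_fresh = permutation_p_value(column(confirm_X, best), confirm_y, rng)

print(f"selected feature: {best}")
print(f"p-value on exploration data:  {p_same:.3f}")
print(f"p-value on confirmation data: {p_fresh:.3f}")
```

Because the selected feature was chosen for being the most extreme of 50 candidates, its p-value on the exploration data tends to look far stronger than the evidence warrants, whereas the p-value on the reserved data behaves like an ordinary test of a single pre-specified hypothesis.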

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f06e/11564444/89fba4d69391/40798_2024_788_Fig1_HTML.jpg
