Has machine learning over-promised in healthcare?: A critical analysis and a proposal for improved evaluation, with evidence from Parkinson's disease.

Authors

Ge Wenbo, Lueck Christian, Suominen Hanna, Apthorp Deborah

Affiliations

School of Computing, Australian National University, 145 Science Road, Acton, 2601, ACT, Australia.

Department of Neurology, Canberra Hospital, Yamba Drive, Garran, 2605, ACT, Australia; ANU Medical School, Australian National University, Hospital Rd, Garran, 2605, ACT, Australia.

Publication information

Artif Intell Med. 2023 May;139:102524. doi: 10.1016/j.artmed.2023.102524. Epub 2023 Mar 16.

Abstract

Adoption of artificial intelligence (AI) by the medical community has long been anticipated, endorsed by a stream of machine learning literature showcasing AI systems that yield extraordinary performance. However, many of these systems are likely over-promising and will under-deliver in practice. One key reason is the community's failure to acknowledge and address the presence of inflationary effects in the data. These simultaneously inflate evaluation performance and prevent a model from learning the underlying task, thus severely misrepresenting how that model would perform in the real world. This paper investigated the impact of these inflationary effects on healthcare tasks, as well as how these effects can be addressed. Specifically, we defined three inflationary effects that occur in medical data sets and allow models to easily reach small training losses and prevent skillful learning. We investigated two data sets of sustained vowel phonation from participants with and without Parkinson's disease, and revealed that published models which have achieved high classification performances on these were artificially enhanced due to the inflationary effects. Our experiments showed that removing each inflationary effect corresponded with a decrease in classification accuracy, and that removing all inflationary effects reduced the evaluated performance by up to 30%. Additionally, the performance on a more realistic test set increased, suggesting that the removal of these inflationary effects enabled the model to better learn the underlying task and generalize. Source code is available at https://github.com/Wenbo-G/pd-phonation-analysis under the MIT license.
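The abstract does not name the three inflationary effects, so the following is only a minimal sketch of one generic kind of inflation, assumed here for illustration: several recordings from the same participant ending up on both sides of a train/test split. The data are synthetic, the labels carry no real signal, and every name (`make_recordings`, `recording_wise_split`, `participant_wise_split`, the 1-nearest-neighbour classifier) is hypothetical rather than taken from the paper or its repository.

```python
import random


def make_recordings(n_participants=40, per_participant=6, seed=0):
    """Synthetic recordings: each participant has a random 'voice fingerprint'
    and a label unrelated to it, so there is no real task to learn."""
    rng = random.Random(seed)
    rows = []
    for pid in range(n_participants):
        label = pid % 2  # hypothetical coding: 1 = PD, 0 = control
        fingerprint = [rng.gauss(0, 1) for _ in range(5)]
        for _ in range(per_participant):
            # Recordings from one participant differ only by small noise.
            features = [f + rng.gauss(0, 0.05) for f in fingerprint]
            rows.append((pid, features, label))
    return rows


def predict_1nn(train, features):
    """Return the label of the closest training recording."""
    nearest = min(
        train,
        key=lambda r: sum((a - b) ** 2 for a, b in zip(r[1], features)),
    )
    return nearest[2]


def accuracy(train, test):
    return sum(predict_1nn(train, f) == y for _, f, y in test) / len(test)


def recording_wise_split(rows, test_fraction=0.25, seed=1):
    """Shuffle individual recordings: a participant can land in both sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


def participant_wise_split(rows, test_fraction=0.25, seed=1):
    """Hold out whole participants: no one appears in both sets."""
    rng = random.Random(seed)
    pids = sorted({pid for pid, _, _ in rows})
    rng.shuffle(pids)
    held_out = set(pids[: int(len(pids) * test_fraction)])
    train = [r for r in rows if r[0] not in held_out]
    test = [r for r in rows if r[0] in held_out]
    return train, test


rows = make_recordings()
leaky_acc = accuracy(*recording_wise_split(rows))
clean_acc = accuracy(*participant_wise_split(rows))
print(f"recording-wise split accuracy:   {leaky_acc:.2f}")
print(f"participant-wise split accuracy: {clean_acc:.2f}")
```

Under these assumptions the recording-wise split scores highly even though the labels are unlearnable, because the classifier only has to recognise a participant it has already seen. Holding out whole participants removes that shortcut, which is the kind of gap between evaluated and real-world performance the abstract describes.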
