State-of-the-Art: The Temporal Order of Benchmarking Culture.

Author information

Campolo Alexander

Affiliation

Department of Geography, Durham University, Lower Mountjoy, South Road, Durham, DH1 3LE UK.

Publication information

Digit Soc. 2025;4(2):35. doi: 10.1007/s44206-025-00190-x. Epub 2025 May 2.

Abstract

This commentary situates the epistemic values of machine learning's culture of benchmarking and evaluation within larger temporal structures. Beyond questions of validity, whether model comparisons are statistically valid or whether benchmarks adequately represent meaningful tasks or capabilities, it asks how benchmarks produce certain temporal values and expectations. It articulates two hypotheses in response: the first, termed normalizing research, seeks to characterize how benchmarking simultaneously serves a disciplining and motivating function in research, with the effect of minimizing conflict. The second, termed extrapolation, argues that the incremental, progressive rhythm of benchmarking is oriented less towards the future than towards a present state-of-the-art (SOTA). Together, these hypotheses inform a diagnosis of the presentist temporality of benchmarking and evaluation in machine learning.

Similar articles

Recommendations for machine learning benchmarks in neuroimaging.
Neuroimage. 2022 Aug 15;257:119298. doi: 10.1016/j.neuroimage.2022.119298. Epub 2022 May 10.
Selection, presentism, and pluralist history.
Stud Hist Philos Sci. 2022 Apr;92:60-70. doi: 10.1016/j.shpsa.2022.01.003. Epub 2022 Feb 5.
