Xu Zhen, Escalera Sergio, Pavão Adrien, Richard Magali, Tu Wei-Wei, Yao Quanming, Zhao Huan, Guyon Isabelle
4Paradigm, Beijing 100085, China.
Computer Vision Center, Universitat de Barcelona, 08007 Barcelona, Spain.
Patterns (N Y). 2022 Jun 24;3(7):100543. doi: 10.1016/j.patter.2022.100543. eCollection 2022 Jul 8.
Obtaining a standardized benchmark of computational methods is a major issue in data-science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here, we introduce Codabench, an open-source, community-driven meta-benchmark platform for benchmarking algorithms or software agents against datasets or tasks. A public instance of Codabench is open to everyone free of charge and allows benchmark organizers to fairly compare submissions under the same setting (software, hardware, data, algorithms), with custom protocols and data formats. Codabench has unique features that facilitate the organization of flexible and reproducible benchmarks, such as reusable benchmark templates and on-demand compute resources. Codabench has been used internally and externally on various applications, attracting more than 130 users and 2,500 submissions. As illustrative use cases, we introduce four diverse benchmarks covering graph machine learning, cancer heterogeneity, clinical diagnosis, and reinforcement learning.