La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America.
Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America.
PLoS Comput Biol. 2018 Nov 8;14(11):e1006494. doi: 10.1371/journal.pcbi.1006494. eCollection 2018 Nov.
Research in computational biology has given rise to a vast number of methods developed to solve scientific problems. In areas where many approaches exist, researchers have a hard time deciding which tool to select for a given scientific challenge, as essentially every publication introducing a new method claims better performance than all others. Not all of these claims can be correct. For the same reason, developers struggle to demonstrate convincingly that they have created a new and superior algorithm or implementation. Moreover, the developer community often has difficulty discerning which new approaches constitute true scientific advances for the field. The obvious answer to this conundrum is to develop benchmarks, meaning standard points of reference that facilitate evaluating the performance of different tools, allowing both users and developers to compare multiple tools in an unbiased fashion.
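To make the "standard point of reference" idea concrete, the following is a minimal sketch of the benchmarking principle described above: several tools are scored against one shared gold-standard reference with one shared metric, rather than each tool being evaluated on its own self-selected test set. The tool names, outputs, and metric here are purely illustrative, not from the paper.

```python
def accuracy(predictions, gold_standard):
    """Fraction of items on which a tool's output agrees with the reference."""
    assert len(predictions) == len(gold_standard)
    correct = sum(p == g for p, g in zip(predictions, gold_standard))
    return correct / len(gold_standard)

# Hypothetical gold-standard labels and the outputs of two competing
# tools on the same benchmark inputs (names are placeholders).
gold = ["A", "B", "B", "A", "C"]
tool_outputs = {
    "tool_x": ["A", "B", "A", "A", "C"],
    "tool_y": ["A", "B", "B", "C", "C"],
}

# Every tool is scored with the same metric against the same reference,
# which is what allows an unbiased, like-for-like comparison.
for name, preds in sorted(tool_outputs.items()):
    print(f"{name}: accuracy = {accuracy(preds, gold):.2f}")
```

In practice a benchmark would use community-agreed reference datasets and domain-appropriate metrics, but the structure is the same: shared inputs, shared ground truth, shared scoring.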