Karp Peter D
Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA. Tel: 650-859-4358; Fax: 650-859-3735; e-mail:
Database (Oxford). 2016 Dec 26;2016. doi: 10.1093/database/baw149. Print 2016.
Can we decrease the costs of database curation by crowd-sourcing curation work or by offloading curation to publication authors? This perspective considers the significant experience accumulated by the bioinformatics community with these two alternatives to professional curation in the last 20 years; that experience should be carefully considered when formulating new strategies for biological databases. The vast weight of empirical evidence to date suggests that crowd-sourced curation is not a successful model for biological databases. Multiple approaches to crowd-sourced curation have been attempted by multiple groups, and extremely low participation rates by 'the crowd' are the overwhelming outcome. The author-curation model shows more promise for boosting curator efficiency. However, its limitations include that the quality of author-submitted annotations is uncertain, the response rate is low (but significant), and to date author curation has involved relatively simple forms of annotation involving one or a few types of data. Furthermore, shifting curation to authors may simply redistribute costs rather than decreasing costs; author curation may in fact increase costs because of the overhead involved in having every curating author learn what professional curators know: curation conventions, curation software and curation procedures.