Crowd-sourcing and author submission as alternatives to professional curation.

Author information

Peter D. Karp

Affiliation

Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA. Tel: 650-859-4358; Fax: 650-859-3735; e-mail:

Publication information

Database (Oxford). 2016 Dec 26;2016. doi: 10.1093/database/baw149. Print 2016.

Abstract

Can we decrease the costs of database curation by crowd-sourcing curation work or by offloading curation to publication authors? This perspective considers the significant experience accumulated by the bioinformatics community with these two alternatives to professional curation in the last 20 years; that experience should be carefully considered when formulating new strategies for biological databases. The vast weight of empirical evidence to date suggests that crowd-sourced curation is not a successful model for biological databases. Multiple approaches to crowd-sourced curation have been attempted by multiple groups, and extremely low participation rates by 'the crowd' are the overwhelming outcome. The author-curation model shows more promise for boosting curator efficiency. However, its limitations include that the quality of author-submitted annotations is uncertain, the response rate is low (but significant), and to date author curation has involved relatively simple forms of annotation involving one or a few types of data. Furthermore, shifting curation to authors may simply redistribute costs rather than decreasing costs; author curation may in fact increase costs because of the overhead involved in having every curating author learn what professional curators know: curation conventions, curation software and curation procedures.
