Bousfield David, McEntyre Johanna, Velankar Sameer, Papadatos George, Bateman Alex, Cochrane Guy, Kim Jee-Hyub, Graef Florian, Vartak Vid, Alako Blaise, Blomberg Niklas
ELIXIR, Wellcome Genome Campus, Cambridge, UK; Ganesha Associates, Cambridge, UK.
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK.
F1000Res. 2016 Feb 11;5. doi: 10.12688/f1000research.7911.1. eCollection 2016.
Data from open access biomolecular data resources, such as the European Nucleotide Archive and the Protein Data Bank, are extensively reused within life science research for comparative studies, method development and the derivation of new scientific insights. Indicators that estimate the extent and utility of such secondary use of research data need to reflect this complex and highly variable data usage. By linking open access scientific literature, via Europe PubMed Central, to the metadata in biological data resources, we separate data citations associated with a deposition statement from citations that capture the subsequent, long-term reuse of data in academia and industry. We extend this analysis to begin to investigate citations of biomolecular resources in patent documents. We find citations in more than 8,000 patents from 2014, demonstrating substantial use and an important role for data resources in defining biological concepts in patents granted to both academic and industrial innovators. Taken together, our results indicate that citation patterns in biomedical literature and patents vary, not only due to citation practice but also according to the data resource cited. The results caution against the use of simple metrics such as citation counts and show that indicators of data use must not only take into account citations within the biomedical literature but also capture reuse of data in industry and other parts of society by including patents and other scientific and technical documents such as guidelines, reports and grant applications.