Database citation in full text biomedical articles.

Affiliation

European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.

Publication information

PLoS One. 2013 May 29;8(5):e63184. doi: 10.1371/journal.pone.0063184. Print 2013.

Abstract

Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high throughput of this text-mining pipeline make this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
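
The abstract does not describe how the pipeline recognises database citations, so the following is only an illustrative sketch of one common approach, not the authors' method: match accession-shaped tokens with regular expressions and keep a match only when a database name appears nearby. The UniProt pattern follows the accession format documented by UniProt and the PDB pattern is the classic four-character identifier. The cue words, the 60-character context window, and the function name `find_citations` are assumptions made for this example.

```python
import re

# Accession-shaped patterns. UniProt: documented 6- or 10-character format.
# PDB: classic 4-character ID (a digit 1-9 followed by three alphanumerics).
PATTERNS = {
    "uniprot": re.compile(
        r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
    ),
    "pdb": re.compile(r"\b[1-9][A-Za-z0-9]{3}\b"),
}

# Hypothetical cue words: a match counts only if one of these occurs nearby.
# This guards against years or sample counts that merely look like PDB IDs.
CUES = {
    "uniprot": ("uniprot", "swiss-prot", "swissprot"),
    "pdb": ("pdb", "protein data bank"),
}


def find_citations(text, window=60):
    """Return sorted (database, accession) pairs found near a cue word."""
    hits = set()
    lowered = text.lower()
    for db, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Look at `window` characters on either side of the candidate.
            start = max(0, match.start() - window)
            context = lowered[start:match.end() + window]
            if any(cue in context for cue in CUES[db]):
                hits.add((db, match.group(0).upper()))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "Coordinates were deposited in the Protein Data Bank (PDB ID 1TUP). "
        "The sequence is UniProt P04637."
    )
    for db, accession in find_citations(sample):
        print(db, accession)
```

Requiring a nearby cue word trades some recall for precision, which matters here because short identifiers such as PDB IDs collide with ordinary numbers in running text. A production pipeline would also need to validate candidates against the databases themselves.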

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/3667078/321e3f5992a6/pone.0063184.g001.jpg
