PaperBot: open-source web-based search and metadata organization of scientific literature.

Author information

Center for Neural Informatics, Structures, & Plasticity; Krasnow Institute for Advanced Study; George Mason University, Fairfax, USA.

Bioengineering Department; George Mason University, Fairfax, USA.

Publication information

BMC Bioinformatics. 2019 Jan 24;20(1):50. doi: 10.1186/s12859-019-2613-z.

Abstract

BACKGROUND

The biomedical literature is expanding at ever-increasing rates, and it has become extremely challenging for researchers to keep abreast of new data and discoveries even in their own domains of expertise. We introduce PaperBot, a configurable, modular, open-source crawler to automatically find and efficiently index peer-reviewed publications based on periodic full-text searches across publisher web portals.

RESULTS

PaperBot may operate stand-alone or it can be easily integrated with other software platforms and knowledge bases. Without user interactions, PaperBot retrieves and stores the bibliographic information (full reference, corresponding email contact, and full-text keyword hits) based on pre-set search logic from a wide range of sources including Elsevier, Wiley, Springer, PubMed/PubMedCentral, Nature, and Google Scholar. Although different publishing sites require different search configurations, the common interface of PaperBot unifies the process from the user perspective. Once saved, all information becomes web accessible allowing efficient triage of articles based on their actual relevance and seamless annotation of suitable metadata content. The platform allows the agile reconfiguration of all key details, such as the selection of search portals, keywords, and metadata dimensions. The tool also provides a one-click option for adding articles manually via digital object identifier or PubMed ID. The microservice architecture of PaperBot implements these capabilities as a loosely coupled collection of distinct modules devised to work separately, as a whole, or to be integrated with or replaced by additional software. All metadata is stored in a schema-less NoSQL database designed to scale efficiently in clusters by minimizing the impedance mismatch between relational model and in-memory data structures.
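To make the idea of a common interface over portal-specific search configurations and a schema-less store more concrete, here is a minimal Python sketch. It is not PaperBot's code: `PortalConfig`, `plan_searches`, `DocumentStore`, the two query builders, and the portal names are all hypothetical, and the query syntaxes are invented for illustration rather than taken from any real publisher portal. The store is an in-memory stand-in for a document database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PortalConfig:
    """Hypothetical per-portal configuration: a name plus a query builder."""
    name: str
    build_query: Callable[[List[str]], str]


def and_query(keywords: List[str]) -> str:
    # Invented syntax: quoted terms joined by AND.
    return " AND ".join(f'"{k}"' for k in keywords)


def plus_query(keywords: List[str]) -> str:
    # Invented syntax: terms joined by '+', spaces percent-encoded.
    return "+".join(k.replace(" ", "%20") for k in keywords)


def plan_searches(portals: List[PortalConfig], keywords: List[str]) -> Dict[str, str]:
    """Single entry point: the same keywords yield one query string per portal."""
    return {p.name: p.build_query(keywords) for p in portals}


@dataclass
class DocumentStore:
    """In-memory stand-in for a schema-less document database."""
    records: Dict[str, dict] = field(default_factory=dict)

    def upsert(self, identifier: str, record: dict) -> dict:
        # Merge new fields into any existing record; no fixed schema is enforced.
        merged = {**self.records.get(identifier, {}), **record}
        self.records[identifier] = merged
        return merged


portals = [PortalConfig("portal_a", and_query), PortalConfig("portal_b", plus_query)]
plan = plan_searches(portals, ["neuron", "digital reconstruction"])

store = DocumentStore()
store.upsert("10.1186/s12859-019-2613-z", {"title": "PaperBot"})
record = store.upsert("10.1186/s12859-019-2613-z", {"species": "mouse"})
```

Because records are plain dictionaries keyed by an identifier such as a DOI, a new metadata field can be attached later without migrating existing entries, which is the property the abstract attributes to the schema-less design.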

CONCLUSIONS

As a testbed, we deployed PaperBot to help identify and manage peer-reviewed articles pertaining to digital reconstructions of neuronal morphology in support of the NeuroMorpho.Org data repository. PaperBot enabled the custom definition of both general and neuroscience-specific metadata dimensions, such as animal species, brain region, neuron type, and digital tracing system. Since deployment, PaperBot helped NeuroMorpho.Org more than quintuple the yearly volume of processed information while maintaining a stable personnel workforce.
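As a rough illustration of configurable metadata dimensions, the sketch below annotates an article record against a user-defined list of dimensions. The four dimension names mirror the neuroscience examples in the abstract; the `annotate` helper, the record layout, and the rejection of unconfigured dimensions are assumptions made for this example, not PaperBot's actual behavior.

```python
from typing import Dict, Tuple

# Dimension names follow the examples in the abstract; the identifiers are hypothetical.
DIMENSIONS: Tuple[str, ...] = ("species", "brain_region", "neuron_type", "tracing_system")


def annotate(record: Dict[str, object],
             annotations: Dict[str, str],
             dimensions: Tuple[str, ...] = DIMENSIONS) -> Dict[str, object]:
    """Return a copy of the record with annotations merged under 'metadata'."""
    unknown = sorted(set(annotations) - set(dimensions))
    if unknown:
        # Assumed policy: only configured dimensions may be annotated.
        raise ValueError(f"unconfigured metadata dimensions: {unknown}")
    updated = dict(record)
    updated["metadata"] = {**record.get("metadata", {}), **annotations}
    return updated


article = {"doi": "10.1186/s12859-019-2613-z"}
annotated = annotate(article, {"species": "mouse", "brain_region": "neocortex"})
```

Reconfiguring the tool for another field would then amount to passing a different `dimensions` tuple, leaving the stored records untouched.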

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e92/6345070/957ce18116c0/12859_2019_2613_Fig1_HTML.jpg
