Zafeiropoulos Haris, Paragkamian Savvas, Ninidakis Stelios, Pavlopoulos Georgios A, Jensen Lars Juhl, Pafilis Evangelos
Department of Biology, University of Crete, Voutes University Campus, P.O. Box 2208, 70013 Heraklion, Crete, Greece.
Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Former U.S. Base of Gournes, P.O. Box 2214, 71003 Heraklion, Crete, Greece.
Microorganisms. 2022 Jan 26;10(2):293. doi: 10.3390/microorganisms10020293.
To elucidate ecosystem functioning, it is fundamental to recognize what processes occur in which environments (where) and which microorganisms carry them out (who). Here, we present PREGO, a one-stop-shop knowledge base providing such associations. PREGO combines text mining and data integration techniques to mine such what-where-who associations from data and metadata scattered across the scientific literature and public omics repositories. Microorganisms, biological processes, and environment types are identified and mapped to ontology terms from established community resources. Co-mentions in text and co-occurrences in metagenomics data and metadata are analyzed to extract associations, and a scoring scheme assigns a level of confidence to each of them. The PREGO knowledge base contains associations for 364,508 microbial taxa, 1090 environmental types, 15,091 biological processes, and 7971 molecular functions, with a total of almost 58 million associations. These associations are available through a web portal, an Application Programming Interface (API), and bulk download. By exploring environments and/or processes associated with each other or with microbes, PREGO aims to assist researchers in the design of experiments and the interpretation of their results. To demonstrate PREGO's capabilities, a thorough presentation of its web interface is given along with a meta-analysis of experimental results from a lagoon-sediment study of sulfur-cycle-related microbes.