Pavlopoulos Georgios A, Promponas Vasilis J, Ouzounis Christos A, Iliopoulos Ioannis
Division of Basic Sciences, University of Crete Medical School, Heraklion, 71110, Greece.
Methods Mol Biol. 2014;1159:77-92. doi: 10.1007/978-1-4939-0709-0_5.
Terms corresponding to biological entities can now be identified within passages of biomedical text corpora; the critical next step is to detect the potential relationships among them. These relationships are typically detected by co-occurrence analysis, which reveals associations between bioentities through their coexistence in single sentences and/or entire abstracts. These associations implicitly define networks whose nodes represent terms, bioentities, or concepts and whose edges represent relationships; edge weights may represent the confidence in these semantic connections. This chapter reviews current methods for co-occurrence analysis, focusing on data storage, analysis, and representation. We highlight scenarios in which these approaches are implemented by tools for information extraction and knowledge inference in the field of systems biology. We illustrate the practical utility of two online resources that provide services of this type, namely STRING and BioTextQuest, and conclude with a discussion of current challenges and future perspectives in the field.
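As a rough illustration of the sentence-level co-occurrence idea described in the abstract, the following minimal Python sketch builds a weighted edge list from a handful of toy sentences. It is not the chapter's method nor the approach used by STRING or BioTextQuest: the term dictionary `TERMS`, the helper names (`split_sentences`, `find_terms`, `cooccurrence_network`), the naive regex sentence splitter, and the use of raw co-occurrence counts as edge weights are all assumptions made for the example. A real pipeline would likely use a proper named-entity tagger and a statistically grounded confidence score.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical term dictionary (assumption); real systems use entity taggers.
TERMS = {"tp53", "mdm2", "brca1"}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_terms(sentence, terms=TERMS):
    """Return the dictionary terms found in a sentence (case-insensitive)."""
    tokens = set(re.findall(r"[A-Za-z0-9]+", sentence.lower()))
    return tokens & terms


def cooccurrence_network(abstracts, terms=TERMS):
    """Count sentence-level co-occurrences.

    Nodes are terms; each edge is a sorted (term_a, term_b) pair whose
    weight is the number of sentences mentioning both terms.
    """
    edges = Counter()
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            found = sorted(find_terms(sentence, terms))
            for a, b in combinations(found, 2):
                edges[(a, b)] += 1
    return edges


# Toy input written for this example, not taken from any corpus.
abstracts = [
    "MDM2 binds TP53 and promotes its degradation. BRCA1 is involved in DNA repair.",
    "TP53 and MDM2 form a feedback loop. TP53 also interacts with BRCA1.",
]

network = cooccurrence_network(abstracts)
for (a, b), weight in sorted(network.items()):
    print(a, b, weight)
```

Counting at the abstract level instead of the sentence level would only require skipping the sentence split and pairing all terms found in each abstract; the raw counts used here as weights are just a stand-in for whatever confidence measure a given tool applies.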