Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo 108-8639, Japan.
Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT), Heidelberg 69120, Germany.
Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae357.
Functional interpretation of biological entities, such as differentially expressed genes, is one of the fundamental analyses in bioinformatics. The task can be addressed by enrichment analysis (EA) using biological pathway databases. However, the textual descriptions of biological entities in public databases are less explored and less often integrated into existing tools, although they have the potential to reveal new mechanisms. Here, we present biotextgraph, a new R package for graphical summarization of textual description data for omics entities, which enables assessment of functional similarities among lists of biological entities. We illustrate application examples of annotating gene identifiers in addition to EA. The results suggest that word-based visualization and text-based inspection of biological entities can reveal a set of biologically meaningful terms that could not be obtained by using biological pathway databases alone, and they support the usefulness of the package in the routine analysis of omics-related data. The package also offers a web-based application for convenient querying.
The package, documentation, and web server are available at: https://github.com/noriakis/biotextgraph.