SimText: a text mining framework for interactive analysis and visualization of similarities among biomedical entities.

Author affiliations

Cologne Center for Genomics (CCG), Medical Faculty of the University of Cologne, University Hospital of Cologne, Cologne 50931, Germany.

Universidad del Desarrollo, Centro de Genética y Genómica, Facultad de Medicina Clínica Alemana, Santiago 7590943, Chile.

Publication information

Bioinformatics. 2021 Nov 18;37(22):4285-4287. doi: 10.1093/bioinformatics/btab365.

DOI:10.1093/bioinformatics/btab365

PMID:34037702

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9502138/

Abstract

SUMMARY

Literature exploration in PubMed on a large number of biomedical entities (e.g. genes, diseases or experiments) can be time-consuming and challenging, especially when assessing associations between entities. Here, we describe SimText, a user-friendly toolset that provides customizable and systematic workflows for the analysis of similarities among a set of entities based on text. SimText can be used for (i) text collection from PubMed and extraction of words with different text mining approaches, and (ii) interactive analysis and visualization of data using unsupervised learning techniques in an interactive app.
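To make the idea of text-based similarity concrete, the following is a minimal sketch, not SimText's actual implementation (SimText itself is R software). It assumes each entity has already been reduced to a set of words extracted from its PubMed abstracts, and it uses Jaccard similarity as one possible measure. The entity names, word sets, and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared words divided by all distinct words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(
    entity_words: dict[str, set[str]],
) -> dict[tuple[str, str], float]:
    """Score every unordered pair of entities by the overlap of their words."""
    return {
        (x, y): jaccard(entity_words[x], entity_words[y])
        for x, y in combinations(sorted(entity_words), 2)
    }


# Hypothetical toy input: words assumed to have been extracted per entity.
entity_words = {
    "GENE_A": {"epilepsy", "seizure", "channel"},
    "GENE_B": {"epilepsy", "seizure", "sodium"},
    "GENE_C": {"cardiomyopathy", "heart"},
}

similarities = pairwise_similarity(entity_words)

# List the pairs from most to least similar.
for pair, score in sorted(similarities.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A pairwise score table like this is the kind of input that unsupervised techniques such as hierarchical clustering could then group and visualize.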

AVAILABILITY AND IMPLEMENTATION

We developed SimText as open-source R software and integrated it into Galaxy (https://usegalaxy.eu), an online data analysis platform with supporting self-learning training material available at https://training.galaxyproject.org. A command-line version of the toolset is available for download from GitHub (https://github.com/dlal-group/simtext) or as a Docker image (https://hub.docker.com/r/dlalgroup/simtext/tags).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
