SegMine workflows for semantic microarray data analysis in Orange4WS.

Affiliation

Jožef Stefan Institute, Ljubljana, Slovenia.

Publication information

BMC Bioinformatics. 2011 Oct 26;12:416. doi: 10.1186/1471-2105-12-416.

Abstract

BACKGROUND

In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets.

RESULTS

We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes.
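To make the notion of an "enriched gene set" concrete, here is a minimal sketch of one common way to score enrichment: a one-sided hypergeometric test on the overlap between a gene set and a list of differentially expressed genes. This is a generic illustration, not the semantic subgroup discovery algorithm used by SegMine, which the abstract does not specify. The gene identifiers, the set names, and the functions `enrichment_p_value` and `rank_gene_sets` are all hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, set_size, selected_size, overlap):
    """One-sided hypergeometric tail: P(X >= overlap).

    X is the number of gene-set members among `selected_size` genes drawn
    without replacement from a universe of `universe_size` genes, of which
    `set_size` belong to the gene set.
    """
    upper = min(set_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def rank_gene_sets(gene_sets, selected, universe):
    """Return (name, overlap, p_value) tuples sorted by ascending p-value."""
    results = []
    for name, members in gene_sets.items():
        members = members & universe  # ignore genes outside the universe
        overlap = len(members & selected)
        p = enrichment_p_value(len(universe), len(members), len(selected), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 20 genes, 5 of them differentially expressed.
universe = {f"g{i}" for i in range(1, 21)}
differentially_expressed = {"g1", "g2", "g3", "g4", "g5"}
gene_sets = {
    "toy_term_A": {"g1", "g2", "g3", "g4", "g10"},
    "toy_term_B": {"g5", "g11", "g12", "g13", "g14"},
}

ranked = rank_gene_sets(gene_sets, differentially_expressed, universe)
for name, overlap, p in ranked:
    print(f"{name}: overlap={overlap}, p={p:.4f}")
```

In this toy setup, `toy_term_A` shares four of its five genes with the differentially expressed list and therefore ranks first. A real analysis would also need to correct for testing many gene sets at once.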

CONCLUSIONS

Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bded/3216973/49588e4c1ba9/1471-2105-12-416-1.jpg
