SegMine workflows for semantic microarray data analysis in Orange4WS.

Affiliation

Jožef Stefan Institute, Ljubljana, Slovenia.

Publication information

BMC Bioinformatics. 2011 Oct 26;12:416. doi: 10.1186/1471-2105-12-416.

DOI:10.1186/1471-2105-12-416

PMID:22029475

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3216973/

Abstract

BACKGROUND

In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets.

RESULTS

We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes.
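
To make the notion of an "enriched gene set" concrete, here is a minimal sketch of a plain over-representation test based on the hypergeometric distribution. It is not the semantic subgroup discovery algorithm used in SegMine, which builds more elaborate rules; it only illustrates the basic question of whether a gene set overlaps a list of selected genes more than chance would predict. The function names, gene identifiers, and the 0.05 threshold are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    Models drawing n_selected genes without replacement from n_universe
    genes, of which n_in_set belong to the gene set being tested.
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def enriched_sets(gene_sets, selected, universe, alpha=0.05):
    """Return (set_name, p_value) pairs with p < alpha, sorted by p-value.

    No multiple-testing correction is applied in this sketch.
    """
    universe = set(universe)
    selected = set(selected) & universe
    results = []
    for name, members in gene_sets.items():
        members = set(members) & universe
        p = hypergeom_enrichment_p(
            len(universe), len(members), len(selected), len(members & selected)
        )
        if p < alpha:
            results.append((name, p))
    return sorted(results, key=lambda item: item[1])


# Hypothetical toy data: 20 genes, two gene sets, 5 selected genes.
universe = [f"g{i}" for i in range(20)]
gene_sets = {
    "set_A": ["g0", "g1", "g2", "g3", "g4"],
    "set_B": ["g5", "g6", "g7", "g8", "g9"],
}
selected = ["g0", "g1", "g2", "g3", "g10"]

hits = enriched_sets(gene_sets, selected, universe)
```

In this toy data only set_A passes the threshold, because four of its five genes appear among the selected genes, while set_B shares none. A real analysis would also correct for testing many gene sets at once.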

CONCLUSIONS

Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.
