A Grid-based solution for management and analysis of microarrays in distributed experiments.

Authors

Porro Ivan, Torterolo Livia, Corradi Luca, Fato Marco, Papadimitropoulos Adam, Scaglione Silvia, Schenone Andrea, Viti Federica

Affiliation

Computer Science, Systems, and Communication Department, University of Genova, Viale Causa 12, 16100 Genova, Italy.

Publication information

BMC Bioinformatics. 2007 Mar 8;8 Suppl 1(Suppl 1):S7. doi: 10.1186/1471-2105-8-S1-S7.

DOI:10.1186/1471-2105-8-S1-S7

PMID:17430574

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1885859/

Abstract

Several systems have been presented in recent years to manage the complexity of large microarray experiments. Although good results have been achieved, most systems fall short in one or more areas. A Grid-based approach may provide a shared, standardized and reliable solution for the storage and analysis of biological data, maximizing the results of experimental efforts. A Grid framework was therefore adopted, both to access large amounts of distributed data remotely and to scale computational performance to terabyte datasets. Two different biological studies have been planned to highlight the benefits of our Grid-based platform. The environment relies on storage and computational services provided by the gLite Grid middleware. It also exploits metadata to let users better classify and search experiments. A state-of-the-art Grid portal has been implemented to hide the complexity of the framework from end users and to give them easy access to the available services and data. The functional architecture of the portal is described. As a first test of system performance, a gene expression analysis was performed on a dataset of Affymetrix GeneChip Rat Expression Array RAE230A chips from the ArrayExpress database. The analysis comprises three steps: (i) group opening and image set uploading, (ii) normalization, and (iii) model-based gene expression computation (based on the PM/MM difference model). Two Linux versions of the dChip software (sequential and parallel) were developed to implement the analysis and were tested on a cluster. The results show that parallelizing the analysis and executing parallel jobs on distributed computational resources does improve performance. Moreover, the Grid environment has been tested both for uploading and accessing distributed datasets through the Grid middleware and for its ability to manage the execution of jobs on distributed computational resources. Results from the Grid test will be discussed in a further paper.
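Step (iii) of the analysis can be illustrated with a small example. The abstract does not give the model's equations, so the sketch below assumes the multiplicative form usually associated with a PM/MM difference model: for array i and probe j of one probe set, PM_ij - MM_ij = theta_i * phi_j + noise, with the constraint that the squared phi values sum to the number of probes. The function name, the alternating least-squares fit, and the stopping rule are illustrative assumptions; this is not the dChip code and it omits outlier detection.

```python
from math import sqrt


def fit_difference_model(pm, mm, n_iter=100, tol=1e-9):
    """Fit PM - MM = theta[i] * phi[j] for one probe set.

    pm, mm: lists of rows, one row per array, one column per probe.
    Returns (theta, phi) with sum(phi[j] ** 2) equal to the probe count.
    Hypothetical helper for illustration only.
    """
    # y[i][j] is the PM/MM difference for array i, probe j.
    y = [[p - m for p, m in zip(pm_row, mm_row)]
         for pm_row, mm_row in zip(pm, mm)]
    n_probes = len(y[0])

    # Start with equal probe sensitivities, which satisfy the constraint.
    phi = [1.0] * n_probes
    theta = [sum(row) / n_probes for row in y]

    for _ in range(n_iter):
        # Least-squares update of probe sensitivities given theta.
        theta_sq = sum(t * t for t in theta)
        phi = [sum(t * row[j] for t, row in zip(theta, y)) / theta_sq
               for j in range(n_probes)]

        # Rescale so that the squared sensitivities sum to n_probes.
        scale = sqrt(n_probes / sum(f * f for f in phi))
        phi = [f * scale for f in phi]

        # Least-squares update of expression indexes given phi
        # (the denominator sum(phi ** 2) equals n_probes after rescaling).
        new_theta = [sum(v * f for v, f in zip(row, phi)) / n_probes
                     for row in y]

        shift = max(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if shift < tol:
            break

    return theta, phi
```

Under these assumptions each probe set is fitted independently of the others, which is one reason this step lends itself to the parallel execution mentioned in the abstract: probe sets can be split across jobs without any communication between them.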





