Initial steps towards a production platform for DNA sequence analysis on the grid.

Affiliation

Bioinformatics Laboratory, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, PO Box 22700, 1100 DE Amsterdam, The Netherlands.

Publication information

BMC Bioinformatics. 2010 Dec 14;11:598. doi: 10.1186/1471-2105-11-598.

Abstract

BACKGROUND

Bioinformatics faces a new data explosion driven by the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, making a switch to other IT infrastructures necessary. Grid and workflow technology can help to handle the data more efficiently and facilitate collaboration. However, interfaces to grids are often unfriendly to novice users.

RESULTS

In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from a single graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available to members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be ported to other grid infrastructures.

CONCLUSIONS

The availability of in-house expertise and tools facilitates the use of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to the capacity and collaboration issues raised by the deployment of next-generation sequencers. We currently use this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/
