Camerlengo Terry, Ozer Hatice Gulcin, Onti-Srinivasan Raghuram, Yan Pearlly, Huang Tim, Parvin Jeffrey, Huang Kun
Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA
AMIA Jt Summits Transl Sci Proc. 2012;2012:1-10. Epub 2012 Mar 19.
Next-generation sequencing (NGS) is highly resource intensive. NGS tasks related to data processing, management, and analysis require high-end computing servers or even clusters. Additionally, processing NGS experiments requires suitable storage space and significant manual interaction. At The Ohio State University's Biomedical Informatics Shared Resource, we designed and implemented a scalable architecture to address the challenges posed by the resource-intensive nature of NGS secondary analysis. The architecture is built around Illumina Genome Analyzer II sequencers and Illumina's Gerald data processing pipeline. The software infrastructure includes a distributed computing platform consisting of a laboratory information management system (LIMS) called QUEST (http://bisr.osumc.edu), an Automation Server, a computer cluster for processing NGS pipelines, and a network attached storage device expandable up to 40TB. The system has been architected to scale to multiple sequencers without requiring additional computing or labor resources. This platform demonstrates how to manage and automate NGS experiments in an institutional or core facility setting.
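The division of labor described in the abstract (a LIMS that records runs, an Automation Server that dispatches them, a cluster that processes them, and shared storage with a fixed ceiling) can be illustrated with a minimal sketch. This is not QUEST's code or interface. The class names, fields, run sizes, and node count below are all hypothetical and chosen only for illustration; the only figure taken from the abstract is the 40TB storage limit.

```python
from collections import deque
from dataclasses import dataclass, field

# From the abstract: network attached storage expandable up to 40TB.
STORAGE_CAPACITY_TB = 40.0


@dataclass
class Run:
    """A sequencing run as a LIMS might record it (hypothetical fields)."""
    run_id: str
    sequencer: str
    size_tb: float            # assumed data volume of the run
    status: str = "registered"


@dataclass
class AutomationServer:
    """Toy dispatcher: hands queued runs to idle cluster nodes."""
    free_nodes: int           # assumed number of idle cluster nodes
    used_tb: float = 0.0
    queue: deque = field(default_factory=deque)

    def register(self, run: Run) -> None:
        # Queue a finished run, whichever sequencer produced it.
        self.queue.append(run)

    def dispatch(self) -> list:
        # Start runs in arrival order while nodes and storage remain.
        started = []
        while self.queue and self.free_nodes > 0:
            run = self.queue[0]
            if self.used_tb + run.size_tb > STORAGE_CAPACITY_TB:
                break         # storage would overflow; keep the run queued
            self.queue.popleft()
            self.free_nodes -= 1
            self.used_tb += run.size_tb
            run.status = "processing"
            started.append(run)
        return started


# Three runs from two hypothetical sequencers share a two-node cluster.
server = AutomationServer(free_nodes=2)
for i, sequencer in enumerate(["GAII-1", "GAII-2", "GAII-1"]):
    server.register(Run(run_id=f"run{i}", sequencer=sequencer, size_tb=1.5))

started = server.dispatch()
print([r.run_id for r in started], "queued:", len(server.queue))
```

In this sketch, adding a sequencer only adds entries to the same queue, which is one simple way to read the claim that the system scales to multiple sequencers without additional labor. The real platform's scheduling policy is not described in the abstract.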