School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA.
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea.
Health Inf Sci Syst. 2013 Feb 4;1:6. doi: 10.1186/2047-2501-1-6. eCollection 2013.
The exponential growth of genomic data brought by next-generation sequencing (NGS) and third-generation sequencing technologies, together with the dramatic drop in sequencing cost, has turned the biological and medical sciences into data-driven sciences. This paradigm shift comes with challenges in the transfer, storage, computation, and analysis of big bio/medical data. Cloud computing, a service model that shares a pool of configurable resources, is a suitable workbench for addressing these challenges. From the medical or biological perspective, the provision of computing power and storage is the most attractive feature of cloud computing for handling ever-increasing biological data. As data grow in size, many research organizations begin to run short of computing power, which becomes a major hurdle to achieving their research goals. In this paper, we review the features of publicly available bio and health cloud systems in terms of graphical user interface, external data integration, security, and extensibility of features. We then discuss the issues and limitations of current cloud systems and conclude by proposing the concept of a biological cloud environment, which can be defined as a total workbench environment that assembles computational tools and databases for analyzing bio/medical big data in particular application domains.