Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8510, Japan.
DNA Res. 2013 Aug;20(4):383-90. doi: 10.1093/dnares/dst017. Epub 2013 May 8.
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biology research. However, the immense volume of sequence data demands computational skills and hardware resources that pose a challenge for molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has launched a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets through decentralized processing on NIG supercomputers, currently free of charge. The pipeline consists of two analysis components: basic analysis, covering reference genome mapping and de novo assembly, and subsequent high-level analysis, covering structural and functional annotation. Users can switch smoothly between the two components, which makes web-based operation of a supercomputer practical for high-throughput data analysis. Moreover, public NGS reads in the DDBJ Sequence Read Archive, which is hosted on the same supercomputer, can be imported into the pipeline by entering only an accession number. The pipeline will facilitate research by applying unified analytical workflows to NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.
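The two-component workflow described above can be illustrated with a minimal Python sketch. It is not the DDBJ Pipeline's actual interface, which is a graphical web application; the names `JobPlan`, `plan_job`, `BASIC_STEPS`, and `HIGH_LEVEL_STEPS` are hypothetical and exist only for illustration. The sketch assumes that a job is identified by a run-level INSDC accession (prefix DRR, ERR, or SRR followed by digits) and that each job chooses one basic analysis step before any high-level annotation steps; the abstract itself does not specify either detail.

```python
import re
from dataclasses import dataclass

# Assumption: run-level INSDC accessions (DRR = DDBJ, ERR = ENA, SRR = NCBI)
# followed by at least six digits. The real pipeline may accept other types.
RUN_ACCESSION = re.compile(r"^[DES]RR\d{6,}$")

# Hypothetical step names mirroring the two components named in the abstract.
BASIC_STEPS = ("reference_mapping", "de_novo_assembly")
HIGH_LEVEL_STEPS = ("structural_annotation", "functional_annotation")


@dataclass(frozen=True)
class JobPlan:
    """An ordered plan: one basic step, then zero or more high-level steps."""
    accession: str
    basic_step: str
    high_level_steps: tuple


def plan_job(accession, basic_step, high_level_steps=HIGH_LEVEL_STEPS):
    """Validate the inputs and return a JobPlan; raise ValueError otherwise."""
    accession = accession.strip().upper()
    if not RUN_ACCESSION.match(accession):
        raise ValueError(f"not a run accession: {accession!r}")
    if basic_step not in BASIC_STEPS:
        raise ValueError(f"unknown basic step: {basic_step!r}")
    unknown = [s for s in high_level_steps if s not in HIGH_LEVEL_STEPS]
    if unknown:
        raise ValueError(f"unknown high-level steps: {unknown}")
    return JobPlan(accession, basic_step, tuple(high_level_steps))


if __name__ == "__main__":
    # The accession below is a format example only, not a specific dataset.
    print(plan_job("DRR000001", "de_novo_assembly"))
```

Keeping the basic and high-level steps in separate fields mirrors the abstract's description of two components that users move between, and validating the accession up front reflects the idea that an accession number alone is enough to select the input reads.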