The Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Shizuoka, 411-8540, Japan.
Nucleic Acids Res. 2020 Jan 8;48(D1):D45-D50. doi: 10.1093/nar/gkz982.
The Bioinformation and DDBJ Center (https://www.ddbj.nig.ac.jp) at the National Institute of Genetics (NIG) maintains a primary nucleotide sequence database as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in partnership with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The NIG operates the NIG supercomputer both as the computational basis for constructing the DDBJ databases and as a large-scale computational resource for Japanese biologists and medical researchers. To accommodate the rapidly growing volume of deoxyribonucleic acid (DNA) nucleotide sequence data, NIG replaced its supercomputer system, which is designed for big data analysis of genome data, in early 2019. The new system is equipped with 30 PB of DNA data archiving storage, large-scale parallel distributed file systems (13.8 PB in total), and 1.1 PFLOPS of computation nodes and graphics processing units (GPUs). Moreover, as a starting point for developing a multi-cloud infrastructure for bioinformatics, we have also installed an automatic file transfer system that helps users avoid data lock-in and balance cost against performance by choosing, for each workload, the most suitable environment from among the supercomputer and public clouds.