Van Quyen Dong, Gan Han Ming, Lee Yin Peng, Nguyen Dinh Duy, Nguyen Thi Hoa, Tran Xuan Thach, Nguyen Van Sang, Khang Dinh Duy, Austin Christopher M
Laboratory of Molecular Microbiology, Institute of Biotechnology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi, Viet Nam; University of Science and Technology of Hanoi, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi, Viet Nam.
Centre of Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Geelong, Australia; Deakin Genomics Centre, Deakin University, Geelong, Australia.
Mar Genomics. 2020 Aug;52:100751. doi: 10.1016/j.margen.2020.100751. Epub 2020 Feb 4.
World production of farmed crustaceans was 7.8 million tons in 2016. While making up only approximately 10% of world aquaculture production, crustaceans are generally high-value species and can earn significant export income for producing countries. Viet Nam is a major seafood-producing country, earning USD 7.3 billion in export income in 2016, with shrimp as a major commodity. However, there is a general lack of genomic resources for shrimp species. Such resources are challenging to obtain because many decapod crustaceans have large, repetitive genomes. The first tiger prawn (P. monodon) genome assembly was produced in 2016 using standard Illumina PCR-based paired-end reads and a computationally efficient but relatively suboptimal assembler, SOAPdenovo v2. As a result, the current P. monodon draft genome is highly fragmented (> 2 million scaffolds with an N50 length of < 1000 bp) and exhibits only moderate genome completeness (< 35% BUSCO complete single-copy genes). We sought to improve the contiguity and completeness of the recently published P. monodon genome assembly by generating Illumina PCR-free paired-end sequencing reads, to eliminate genomic gaps associated with PCR bias, and by performing de novo assembly with the updated MaSuRCA assembler. Furthermore, we scaffolded the assembly with low-coverage Nanopore long reads and several recently published deep Illumina paired-end transcriptome data sets, producing a final genome assembly of 1.6 Gbp (1,211,364 scaffolds; N50 length of 1982 bp) with an Arthropod BUSCO completeness of 96.8%. Compared to the previously published P. monodon genome assembly from China (NCBI Accession Code: NIUS01), this represents an almost 20% increase in overall BUSCO genome completeness, and the assembly now contains more than 90% of Arthropod BUSCO single-copy genes. The revised P. monodon genome assembly (NCBI Accession Code: VIGR01) will be a valuable resource to support ongoing functional genomics and molecular-based breeding studies in Viet Nam.
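The scaffold length statistic quoted in the abstract is conventionally the N50: the largest length L such that scaffolds of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Illustrative helper only; it is not taken from the study.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings coverage to >= 50% of the total.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly of eight scaffolds (32 bp in total).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

In the toy example, the two 8 bp scaffolds already cover 16 of 32 bp, so the N50 is 8. A highly fragmented assembly, with millions of short scaffolds, yields a low N50 even when its total length is large.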