Ma Hongyu, Ma Chunyan, Li Shujuan, Jiang Wei, Li Xincang, Liu Yuexing, Ma Lingbo
East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai, China; Key Laboratory of East China Sea and Oceanic Fishery Resources Exploitation, Ministry of Agriculture, Shanghai, China.
PLoS One. 2014 Jul 23;9(7):e102668. doi: 10.1371/journal.pone.0102668. eCollection 2014.
In this study, we report the first transcriptome characterization of the mud crab (Scylla paramamosain). Pooled cDNAs from four tissue types of twelve wild individuals were sequenced on the Roche 454 FLX platform. The analysis comprised de novo assembly of the transcriptome sequences, functional annotation, and molecular marker discovery. Sequencing of the mixed cDNA library generated a total of 1,314,101 high-quality reads with an average length of 411 bp. De novo assembly of these reads produced 76,778 contigs (comprising 818,154 reads) with 5.4-fold average sequencing coverage; the remaining 495,947 reads were singletons. A total of 78,268 unigenes were identified based on sequence similarity with known proteins (E ≤ 0.00001) in the UniProt and non-redundant protein databases. In addition, 44,433 sequences were identified (E ≤ 0.00001) by a BLASTN search against the NCBI nucleotide database. Gene Ontology (GO) analysis indicated that biosynthetic process, cell part, and ion binding were the most abundant terms in the biological process, cellular component, and molecular function categories, respectively. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis assigned 4,878 unigenes to 281 different pathways. Furthermore, 19,011 microsatellites and 37,063 potential single nucleotide polymorphisms were detected in the S. paramamosain transcriptome. Finally, thirty polymorphic microsatellite markers were developed and used to assess the genetic diversity of a wild population of S. paramamosain. To date, sequence resources for S. paramamosain have been extremely limited. The present study provides a transcriptome characterization from multiple tissues and individuals, as well as an assessment of the genetic diversity of a wild population. These sequence resources will facilitate the investigation of population genetic diversity, the development of genetic maps, and molecular marker-assisted breeding in S. paramamosain and related crab species.
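The abstract does not state which tool or repeat thresholds were used for microsatellite discovery. As an illustration only, the following minimal Python sketch shows how perfect microsatellites (tandem repeats of 2 to 6 bp motifs) might be located in an assembled sequence. The function names and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices for this example, not the criteria applied in the study.

```python
import re

# Hypothetical minimum repeat counts per motif length (illustrative only;
# not the thresholds used in the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ACAC')."""
    n = len(motif)
    return any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} requires at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs already covered by a shorter repeat unit.
            if is_redundant(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing an (AC)8 and an (AGC)5 repeat.
    example = "TTG" + "AC" * 8 + "GGT" + "AGC" * 5 + "TCA"
    for start, motif, count in find_microsatellites(example):
        print(start, motif, count)
```

This sketch reports only perfect repeats at 0-based start positions. Dedicated tools additionally handle compound and interrupted repeats, which a simple regular-expression scan does not.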