Kang Se Won, Patnaik Bharat Bhusan, Hwang Hee-Ju, Park So Young, Chung Jong Min, Song Dae Kwon, Patnaik Hongray Howrelia, Lee Jae Bong, Kim Changmu, Kim Soonok, Park Hong Seog, Park Seung-Hwan, Park Young-Su, Han Yeon Soo, Lee Jun Sang, Lee Yong Seok
Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungcheongnam-do 31538, Republic of Korea.
Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungcheongnam-do 31538, Republic of Korea; Trident School of Biotech Sciences, Trident Academy of Creative Technology (TACT), Bhubaneswar, Odisha, 751024, India.
Comp Biochem Physiol Part D Genomics Proteomics. 2017 Mar;21:77-89. doi: 10.1016/j.cbd.2016.10.004. Epub 2016 Oct 29.
Satsuma myomphala is critically endangered owing to loss of natural habitat, predation by natural enemies, and indiscriminate collection. It is a protected species in Korea, but it lacks the genomic resources needed to understand the functional processes underlying its evolutionary success in natural habitats. To assess the genetic information of S. myomphala, we performed, for the first time, de novo transcriptome sequencing and functional annotation of expressed sequences using the Illumina next-generation sequencing (NGS) platform and bioinformatics analysis. We identified 103,774 unigenes, of which 37,959, 12,890, and 17,699 were annotated in the PANM (Protostome DB), Unigene, and COG (Clusters of Orthologous Groups) databases, respectively. In addition, 14,451 unigenes were classified under Gene Ontology functional categories, with 4581 assigned to a single category. Furthermore, 3369 sequences, 646 of which had Enzyme Commission (EC) numbers, were mapped to 122 pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database. The most prominent protein domains were the Zinc finger (C2H2-like), Reverse Transcriptase, Thioredoxin-like fold, and RNA recognition motif domains. Many unigenes with homology to immunity-, defense-, and reproduction-related genes were identified in the transcriptome. We also detected 3120 putative simple sequence repeats (SSRs), with dinucleotide to hexanucleotide repeat motifs, in unigene sequences longer than 1 kb. PCR primers for the SSR loci were designed to support studies of genetic polymorphism. The transcriptome data represent a valuable resource for further investigation of the species' genome structure and biology. The unigene information and microsatellites should provide an indispensable tool for conservation of the species in natural and adaptive environments.
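The SSR screening step described in the abstract (dinucleotide to hexanucleotide motifs, restricted to unigenes longer than 1 kb) can be illustrated with a minimal Python sketch. The abstract does not name the tool or the minimum repeat counts that were used, so the thresholds in `MIN_REPEATS`, the function names, and the dictionary-based input format are all assumptions made for illustration. This is a simplified regular-expression scan, not the authors' pipeline.

```python
import re

# ASSUMPTION: minimum number of tandem repeats per motif length.
# The abstract does not state the thresholds that were actually used.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

# From the abstract: SSRs were screened in unigenes longer than 1 kb.
MIN_UNIGENE_LENGTH = 1000


def is_primitive(motif):
    """Return False if the motif is itself a repeat of a shorter unit (e.g. 'ATAT', 'AAA')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # skip homopolymers and motifs reducible to a shorter unit
            if any(m.start() < end and start < m.end() for start, end, _, _ in hits):
                continue  # region already reported with a shorter motif
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


def ssrs_in_long_unigenes(unigenes, min_length=MIN_UNIGENE_LENGTH):
    """Scan only unigenes longer than min_length; unigenes maps a hypothetical ID to a sequence."""
    return {
        uid: find_ssrs(seq)
        for uid, seq in unigenes.items()
        if len(seq) > min_length
    }
```

In practice a dedicated microsatellite tool would also handle compound and imperfect repeats, which this sketch ignores. The coordinates returned are 0-based, half-open intervals, which makes it straightforward to extract flanking sequence for primer design.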