Geng Xiaofang, Li Wanshun, Shang Haitao, Gou Qiang, Zhang Fuchun, Zang Xiayan, Zeng Benhua, Li Jiang, Wang Ying, Ma Ji, Guo Jianlin, Jian Jianbo, Chen Bing, Qiao Zhigang, Zhou Minghui, Wei Hong, Fang Xiaodong, Xu Cunshuan
State Key Laboratory Cultivation Base for Cell Differentiation Regulation, College of Life Science, Henan Normal University, Xinxiang 453007, Henan Province, China.
Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi 830046, China.
Gigascience. 2017 Mar 1;6(3):1-7. doi: 10.1093/gigascience/gix006.
The Chinese giant salamander (CGS) is the largest extant amphibian species in the world. Owing to its evolutionary position and four peculiar life phenomena (longevity, starvation tolerance, regenerative ability, and hatching without sunshine), it is an invaluable model species for research. However, progress in these fields has been limited by a lack of genomic resources, because its huge genome of ∼50 Gb is extremely difficult to assemble.
We report the sequenced transcriptomes of more than 20 tissues from adult CGS, generated with Illumina HiSeq 2000 technology, from which a total of 93 366 non-redundant transcripts with a mean length of 1326 bp were obtained. We developed, for the first time, an efficient pipeline to construct a high-quality reference gene set for CGS and obtained 26 135 coding genes. BUSCO and homology assessment showed that our assembly captured 70.6% of vertebrate universal single-copy orthologs, and that this coding gene set had a higher proportion of complete CDS, with quality comparable to that of the protein set of the Tibetan frog.
These high-quality data will provide a valuable reference gene set for subsequent research on CGS. In addition, our strategy for de novo transcriptome assembly and protein identification is applicable to similar studies.