Montero-Mendieta Santiago, Grabherr Manfred, Lantz Henrik, De la Riva Ignacio, Leonard Jennifer A, Webster Matthew T, Vilà Carles
Conservation and Evolutionary Genetics Group, Department of Integrative Ecology, Doñana Biological Station (EBD-CSIC), Consejo Superior de Investigaciones Científicas, Seville, Spain.
Department of Medical Biochemistry and Microbiology, National Bioinformatics Infrastructure Sweden (BILS), Uppsala Universitet, Uppsala, Sweden.
PeerJ. 2017 Sep 1;5:e3702. doi: 10.7717/peerj.3702. eCollection 2017.
Whole genome sequencing (WGS) is a valuable resource for understanding the evolutionary history of poorly known species. However, in organisms with large genomes, such as most amphibians, WGS is still excessively challenging, and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome, so the transcriptome must be assembled de novo. We used RNA-seq to obtain the transcriptomic profile of a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggest that genes related to the immune system and defense mechanisms are abundant in the transcriptome of this species. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating a transcriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the steps needed to build transcriptome assemblies using readily available software, and is freely available at: https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki.
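To put the annotation counts in proportion, the short sketch below works out the share of putative genes with a hit in each database, along with the average number of transcripts per putative gene. It is only an illustrative calculation from the figures quoted in the abstract. It assumes that "Of those" refers to the 422,999 putative genes, and the variable and function names are hypothetical, not part of the authors' pipeline.

```python
# Counts quoted in the abstract.
TRANSCRIPTS = 550_871
PUTATIVE_GENES = 422_999
ANNOTATED_GENES = {"Pfam": 23_500, "EggNOG": 37_349, "KEGG": 38_120, "GO": 45_885}


def annotation_fractions(annotated, total):
    """Return, for each database, the share of `total` genes that has a hit."""
    return {db: count / total for db, count in annotated.items()}


# Assumption: the denominator is the number of putative genes, not transcripts.
fractions = annotation_fractions(ANNOTATED_GENES, PUTATIVE_GENES)
transcripts_per_gene = TRANSCRIPTS / PUTATIVE_GENES

if __name__ == "__main__":
    print(f"transcripts per putative gene: {transcripts_per_gene:.2f}")
    for db, share in fractions.items():
        print(f"{db:7s} {share:.1%}")
```

Under that assumption, each database annotates only about 6% to 11% of the putative genes, and there are about 1.3 assembled transcripts per putative gene.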