Department of Computer Science, East Carolina University, Greenville, NC, 27858, USA.
Division of Biology, Kansas State University, Manhattan, KS, 66506, USA.
BMC Genomics. 2020 Jan 14;21(1):47. doi: 10.1186/s12864-019-6394-6.
The red flour beetle Tribolium castaneum has emerged as an important model organism for the study of gene function in development and physiology, for ecological and evolutionary genomics, for pest control and for a plethora of other topics. RNA interference (RNAi), transgenesis and genome editing are well established, and resources for genome-wide RNAi screening have become available in this model. All these techniques depend on a high-quality genome assembly and precise gene models. However, the first version of the genome assembly was generated by Sanger sequencing, and only a small set of RNA sequence data was available, which limited annotation quality.
Here, we present an improved genome assembly (Tcas5.2) and an enhanced genome annotation resulting in a new official gene set (OGS3) for Tribolium castaneum, which significantly increase the quality of the genomic resources. By adding large-distance jumping library DNA sequencing to join scaffolds and fill small gaps, we reduced the gaps in the genome assembly and increased the N50 to 4753 kbp. The precision of the gene models was enhanced by the use of a large body of RNA-Seq reads from different life-history stages and tissue types, leading to the discovery of 1452 novel gene sequences. We also added new features such as alternative splicing, well-defined UTRs and microRNA target predictions. For quality control, 399 gene models were evaluated by manual inspection. The current gene set was submitted to GenBank and accepted as a RefSeq genome by NCBI.
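For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the scaffold at which the cumulative sum of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not data or code from the Tcas5.2 assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    if not ordered:
        raise ValueError("need at least one scaffold length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


# Hypothetical toy assembly (arbitrary units), before and after joining
# the two longest scaffolds into one; the gap length is ignored here.
toy_before = [400, 300, 200, 100, 50, 50]
toy_after = [700, 200, 100, 50, 50]

print(n50(toy_before), n50(toy_after))
```

In this toy case the total length is unchanged, yet joining two scaffolds raises the N50, which is why scaffold-joining steps such as the jumping-library sequencing described above are reflected in this statistic.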
The new genome assembly (Tcas5.2) and the official gene set (OGS3) provide enhanced genomic resources for genetic work in Tribolium castaneum. The much-improved information on transcription start sites supports transgenic and gene-editing approaches. Further, novel types of information, such as splice variants and microRNA target genes, open additional possibilities for analysis.