Euglena gracilis Genome and Transcriptome: Organelles, Nuclear Genome Assembly Strategies and Initial Features.

Authors

Ebenezer ThankGod Echezona, Carrington Mark, Lebert Michael, Kelly Steven, Field Mark C

Affiliations

Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK.

School of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK.

Publication information

Adv Exp Med Biol. 2017;979:125-140. doi: 10.1007/978-3-319-54910-1_7.

Abstract

Euglena gracilis is a major component of the aquatic ecosystem and, together with closely related species, is ubiquitous worldwide. Euglenoids are an important group of protists that possess a secondarily acquired plastid and are relatives of the Kinetoplastidae, which themselves have global impact as disease agents. To understand the biology of E. gracilis, as well as to provide further insight into the evolution and origins of the Kinetoplastidae, we embarked on sequencing the nuclear genome; the plastid and mitochondrial genomes are already in the public domain. Earlier studies suggested an extensive nuclear DNA content, likely with a high degree of repetitive sequence, together with significant extrachromosomal elements. To produce a list of coding sequences, we have combined transcriptome data from both published and new sources and embarked on de novo sequencing using a combination of 454, Illumina paired-end libraries and long PacBio reads. Preliminary analysis suggests a surprisingly large genome approaching 2 Gbp, with a highly fragmented architecture and extensive repeat composition. Over 80% of the RNAseq reads from E. gracilis map to the assembled genome sequence, which is comparable with the well-assembled genomes of T. brucei and T. cruzi. In order to achieve this level of assembly we employed multiple informatics pipelines, which are discussed here. Finally, as a preliminary view of the genome architecture, we discuss the tubulin and calmodulin genes, which highlight potential novel splicing mechanisms.
