Improved methods and resources for Paramecium genomics: transcription units, gene annotation and gene expression.

Author information

Arnaiz Olivier, Van Dijk Erwin, Bétermier Mireille, Lhuillier-Akakpo Maoussi, de Vanssay Augustin, Duharcourt Sandra, Sallet Erika, Gouzy Jérôme, Sperling Linda

Affiliations

Institute for Integrative Biology of the Cell (I2BC), CNRS, CEA, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette CEDEX, France.

Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, F-75205, Paris, France.

Publication information

BMC Genomics. 2017 Jun 26;18(1):483. doi: 10.1186/s12864-017-3887-z.

Abstract

BACKGROUND

The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage.

RESULTS

We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that, in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high-quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for three other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource.
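As a rough illustration of why oriented (strand-specific) RNA-Seq helps delimit transcription units in a gene-dense genome, the sketch below calls candidate units separately on each strand from per-base read coverage. It is not TrUC's algorithm, which the abstract does not describe. The function names, the coverage threshold `min_cov`, the gap tolerance `max_gap` and the minimum length `min_len` are all assumptions chosen for the example, and the Cap-Seq evidence is not modelled.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based


def covered_runs(coverage: List[int], min_cov: int) -> List[Interval]:
    """Return maximal runs of positions whose depth is at least min_cov."""
    runs: List[Interval] = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth >= min_cov and start is None:
            start = pos  # a covered run opens here
        elif depth < min_cov and start is not None:
            runs.append((start, pos))  # the run closes before this position
            start = None
    if start is not None:
        runs.append((start, len(coverage)))  # run extends to the last base
    return runs


def merge_runs(runs: List[Interval], max_gap: int) -> List[Interval]:
    """Join consecutive runs separated by at most max_gap uncovered bases."""
    merged: List[Interval] = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def call_units(
    cov_by_strand: Dict[str, List[int]],
    min_cov: int = 2,    # hypothetical threshold, not taken from the paper
    max_gap: int = 0,    # hypothetical gap tolerance
    min_len: int = 1,    # hypothetical minimum unit length
) -> List[Tuple[str, int, int]]:
    """Call candidate units on each strand independently."""
    units: List[Tuple[str, int, int]] = []
    for strand in sorted(cov_by_strand):
        runs = merge_runs(covered_runs(cov_by_strand[strand], min_cov), max_gap)
        units.extend((strand, s, e) for s, e in runs if e - s >= min_len)
    return units


if __name__ == "__main__":
    # Toy coverage: two strands over the same 13 bp window.
    plus = [0, 3, 4, 5, 0, 0, 6, 7, 0, 1, 9, 9, 9]
    minus = [5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(call_units({"+": plus, "-": minus}, min_cov=2, max_gap=2))
```

Because each strand is processed on its own, reads from a neighbouring gene on the opposite strand cannot extend or fuse a unit, which is the property that matters when genes are tightly packed.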

CONCLUSIONS

We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTRs will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis-regulatory motifs). The improved P. tetraurelia transcriptome resource, gene annotations for P. tetraurelia, P. biaurelia, P. sexaurelia and P. caudatum, and a Paramecium-trained EuGene configuration are available through ParameciumDB (http://paramecium.i2bc.paris-saclay.fr). TrUC software is freely distributed under a GNU GPL v3 licence (https://github.com/oarnaiz/TrUC).
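To make the link between transcription units and UTR predictions concrete, here is a minimal sketch of how 5' and 3' UTR intervals could be derived once a transcription unit and the coding region inside it are both known. The function `utr_intervals`, the half-open 0-based coordinate convention and the toy coordinates are assumptions made for illustration; they are not taken from the paper or from the TrUC code.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the forward reference


def utr_intervals(
    tu: Interval, cds: Interval, strand: str
) -> Dict[str, Optional[Interval]]:
    """Split a transcription unit into 5' and 3' UTRs around its coding region.

    Coordinates are always given left to right on the reference; the strand
    decides which flank is 5' and which is 3'. An empty flank is None.
    """
    tu_start, tu_end = tu
    cds_start, cds_end = cds
    if not (tu_start <= cds_start < cds_end <= tu_end):
        raise ValueError("coding region must lie inside the transcription unit")

    left = (tu_start, cds_start) if cds_start > tu_start else None
    right = (cds_end, tu_end) if tu_end > cds_end else None

    if strand == "+":
        return {"five_prime": left, "three_prime": right}
    if strand == "-":
        # On the reverse strand the transcript starts at the right-hand end.
        return {"five_prime": right, "three_prime": left}
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(utr_intervals((100, 1000), (130, 950), "+"))
    print(utr_intervals((100, 1000), (130, 950), "-"))
```

The point of the sketch is that UTR boundaries are only as accurate as the transcription unit ends, so errors in unit prediction propagate directly into the UTR coordinates used for downstream analyses.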
