Improved detection and annotation of transposable elements in sequenced genomes using multiple reference sequence sets.

Authors

Buisine Nicolas, Quesneville Hadi, Colot Vincent

Affiliation

Unité de Recherche en Génomique Végétale, INRA UMR1165-CNRS UMR8114-Université d'Evry Val d'Essonne, 2 rue Gaston Crémieux, 91057 Evry, France.

Publication information

Genomics. 2008 May;91(5):467-75. doi: 10.1016/j.ygeno.2008.01.005. Epub 2008 Mar 14.

Abstract

Transposable elements (TEs) are ubiquitous components of eukaryotic genomes that impact many aspects of genome function. TE detection in genomic sequences is typically performed using similarity searches against a set of reference sequences built from previously identified TEs. Here, we demonstrate that this process can be improved by designing reference sets that incorporate key aspects of the structure and evolution of TEs and by combining these sets with Repbase Update (RU), which is composed mainly of consensus sequences. Using the Arabidopsis genome as a test case, our approach leads to the detection of an extra 12.4% of TE sequences. These correspond to novel TE fragments as well as to the extension of TE fragments already detected by RU. Significantly, we find that TE detection could be readily optimized using only two reference sets, one containing true consensus sequences and the other mosaic sequences that capture the structural diversity of TE copies within a family.
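The gain from combining reference sets can be illustrated with a small interval calculation. The sketch below is not the authors' pipeline: it assumes that hits from each reference set are available as half-open (start, end) coordinates on a single sequence, and that the gain is measured as newly covered bases relative to the bases covered by the baseline set alone, a definition the abstract does not spell out. The function names and the toy coordinates are hypothetical.

```python
from typing import Iterable, List, Tuple

# Half-open [start, end) coordinates on one sequence (an assumption of this sketch).
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals: Iterable[Interval]) -> int:
    """Count bases covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def extra_coverage_percent(baseline: Iterable[Interval],
                           additional: Iterable[Interval]) -> float:
    """Bases gained by adding `additional` hits, as a percentage of baseline coverage."""
    baseline = list(baseline)
    base = covered_bases(baseline)
    if base == 0:
        raise ValueError("baseline set covers no bases")
    combined = covered_bases(baseline + list(additional))
    return 100.0 * (combined - base) / base


# Hypothetical toy coordinates, not data from the paper.
baseline_hits = [(100, 500), (900, 1200)]      # e.g. hits from a consensus-only set
additional_hits = [(450, 600), (2000, 2100)]   # one extension, one novel fragment
gain = extra_coverage_percent(baseline_hits, additional_hits)
```

In the toy data, the first additional hit extends a fragment already found by the baseline set and the second is a fragment the baseline set misses, mirroring the two kinds of gain the abstract describes.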
