Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

Affiliations

School of Biological Sciences, The University of Adelaide, Adelaide, SA 5005, Australia.

Evolutionary Biology Unit, South Australian Museum, Adelaide, SA 5005, Australia.

Publication information

PLoS One. 2018 Mar 14;13(3):e0193588. doi: 10.1371/journal.pone.0193588. eCollection 2018.

DOI: 10.1371/journal.pone.0193588
PMID: 29538441
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5851578/
Abstract

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
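The single linkage clustering step described in the abstract can be illustrated with a minimal sketch. It is not CARP's implementation: the input format (pairs of genomic interval labels that a pairwise aligner reported as matching), the function name `single_linkage_families`, and the example intervals are all assumptions made for illustration. Under single linkage, two sequences belong to the same family whenever a chain of pairwise hits connects them, which a union-find structure captures directly.

```python
from collections import defaultdict


def single_linkage_families(hits):
    """Group interval labels into families.

    `hits` is an iterable of (a, b) pairs, each meaning "a aligned to b".
    Two labels end up in the same family if any chain of hits links them
    (single linkage). The pair format is hypothetical, not krishna output.
    """
    parent = {}

    def find(x):
        # Register unseen labels as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = defaultdict(set)
    for label in list(parent):
        families[find(label)].add(label)

    # Sort members and families so the output is deterministic.
    return sorted((sorted(m) for m in families.values()), key=lambda f: f[0])


# Hypothetical pairwise hits between genomic intervals.
hits = [
    ("chr1:100-400", "chr2:50-350"),
    ("chr2:50-350", "chr5:900-1200"),
    ("chr3:10-210", "chr4:700-900"),
]
families = single_linkage_families(hits)
for members in families:
    print(len(members), members)
```

Because the family members here are the interval labels themselves, the grouping keeps the genomic coordinates alongside each family. This is the property the abstract lists as the pipeline's first advantage over methods that return consensus sequences only.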

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6495/5851578/ebc069090662/pone.0193588.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6495/5851578/90115b5de567/pone.0193588.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6495/5851578/1b9625a4b126/pone.0193588.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6495/5851578/796eaae71c41/pone.0193588.g004.jpg
