
Efficient homology-based annotation of transposable elements using minimizers.

Authors

Gonzalez-García Laura Natalia, Lozano-Arce Daniela, Londoño Juan Pablo, Guyot Romain, Duitama Jorge

Affiliations

Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia.

UMR DIADE, Institut de Recherche pour le Développement, Université de Montpellier, CIRAD, 34394 Montpellier, France.

Publication information

Appl Plant Sci. 2023 May 11;11(4):e11520. doi: 10.1002/aps3.11520. eCollection 2023 Jul-Aug.

DOI: 10.1002/aps3.11520
PMID: 37601317
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10439823/

Abstract

PREMISE

Transposable elements (TEs) make up more than half of the genomes of complex plant species and can modulate the expression of neighboring genes, producing significant variability of agronomically relevant traits. The availability of long-read sequencing technologies allows the building of genome assemblies for plant species with large and complex genomes. Unfortunately, TE annotation currently represents a bottleneck in the annotation of genome assemblies.

METHODS AND RESULTS

We present a new functionality of the Next-Generation Sequencing Experience Platform (NGSEP) to perform efficient homology-based TE annotation. Sequences in a reference library are treated as long reads and mapped to an input genome assembly. A hierarchical annotation is then assigned by homology using the annotation of the reference library. We tested the performance of our algorithm on genome assemblies of different plant species, including [species name missing], [species name missing], and Triticum aestivum (bread wheat). Our algorithm outperforms traditional homology-based annotation tools in speed by a factor of three to >20, reducing the annotation time of the bread wheat genome from months to hours, and recovering up to 80% of TEs annotated with RepeatMasker with a precision of up to 0.95.
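
The idea of mapping library sequences to an assembly through minimizers can be illustrated with a small Python sketch. This is not NGSEP's implementation: the function names, the k-mer length `k`, the window size `w`, the toy sequences, and the `LTR/Copia` label are all assumptions chosen for illustration. Each library sequence is reduced to its window minimizers, looked up in a minimizer index of the genome, and placed at the offset (diagonal) supported by the most shared minimizers; the library entry's annotation is then transferred to that location.

```python
from collections import Counter, defaultdict


def minimizers(seq, k, w):
    """Return the set of (kmer, position) minimizers of seq.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer is selected (leftmost on ties). Sequences shorter
    than k + w - 1 yield an empty set.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: window[j])
        selected.add((window[offset], start + offset))
    return selected


def build_index(genome, k, w):
    """Map each minimizer k-mer of the genome to its positions."""
    index = defaultdict(list)
    for kmer, pos in minimizers(genome, k, w):
        index[kmer].append(pos)
    return index


def best_hit(query, index, k, w):
    """Return (genome_offset, shared_minimizers) for the best diagonal, or None."""
    votes = Counter()
    for kmer, qpos in minimizers(query, k, w):
        for gpos in index.get(kmer, ()):
            votes[gpos - qpos] += 1  # hits on one diagonal agree on the offset
    if not votes:
        return None
    return votes.most_common(1)[0]


def annotate(genome, library, k=5, w=4):
    """Transfer each library label to the best-supported genome offset."""
    index = build_index(genome, k, w)
    annotations = []
    for name, (label, seq) in library.items():
        hit = best_hit(seq, index, k, w)
        if hit is not None:
            annotations.append((name, label, hit[0]))
    return annotations


# Toy data (hypothetical): one TE copy embedded between two flanks.
left_flank = "CGTACGTTGCATGCCGTAGC"
te = "GATCCAAAAATGCGTACCTG"
right_flank = "TTGCACGGTCATGCCTAGCT"
genome = left_flank + te + right_flank

library = {
    "TE_demo": ("LTR/Copia", te),           # present in the toy genome
    "TE_absent": ("DNA/hAT", "C" * 12),     # shares no minimizer with it
}
annotations = annotate(genome, library)
print(annotations)
```

In this toy case the only annotation reported is `TE_demo`, placed at offset 20, the length of the left flank. A real mapper would additionally chain hits, tolerate mismatches and indels, and handle both strands, none of which is attempted here.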

CONCLUSIONS

NGSEP allows rapid analysis of TEs, especially in very large and TE-rich plant genomes.
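
The recall and precision figures quoted in the Methods and Results compare the new annotation against RepeatMasker. The abstract does not state how overlap was scored, so the sketch below assumes a simple base-level comparison of half-open `(start, end)` intervals on a single sequence; the function names and the example coordinates are hypothetical.

```python
def covered_bases(intervals):
    """Return the set of positions covered by half-open (start, end) intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def precision_recall(predicted, reference):
    """Base-level precision and recall of predicted against reference intervals."""
    pred = covered_bases(predicted)
    ref = covered_bases(reference)
    shared = len(pred & ref)
    precision = shared / len(pred) if pred else 0.0
    recall = shared / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical coordinates: one predicted TE, two reference TEs.
predicted = [(0, 100)]
reference = [(50, 150), (200, 250)]
precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Building explicit sets of positions keeps the sketch short but would not scale to a genome as large as bread wheat; sorted interval merging would be the usual choice there.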
