Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes.

Affiliations

Department of Computer Science, Universidad Autónoma de Manizales, 170001, Caldas, Colombia.

Department of Systems and Informatics, Center for Technology Development - Bioprocess and Agro-industry Plant, Universidad de Caldas, 170004, Caldas, Colombia.

Publication information

Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac511.

DOI:10.1093/bib/bbac511
PMID:36502372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9851300/
Abstract

LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involve a lot of manual work and do not reach satisfactory levels in terms of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
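
The abstract compares tools by accuracy, F1-Score, specificity and sensitivity. As a reminder of how these metrics relate to one another, the following minimal sketch computes them from confusion-matrix counts using the standard textbook definitions. The names `ConfusionCounts` and `evaluate` and the example counts are hypothetical illustrations; they are not taken from the paper or from the Inpactor2 code base, and the paper's own evaluation protocol may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of predictions against a reference annotation (hypothetical container)."""
    tp: int  # true positives: real elements that were detected
    fp: int  # false positives: detections that are not real elements
    tn: int  # true negatives: non-elements correctly left undetected
    fn: int  # false negatives: real elements that were missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard binary-classification metrics named in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the paper.
    example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.3f}")
```

A tool can rank lower on specificity yet higher on sensitivity than a competitor, because the two metrics are computed from different rows of the confusion matrix.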

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/6d302b0cd891/bbac511f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/b23e5cc58449/bbac511f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/8a9a2f29edda/bbac511f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/60dc63204518/bbac511f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/8b554fbdc1ed/bbac511f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/cf919861e5f4/bbac511f6.jpg

Similar articles

1. Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes.
   Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac511.
2. Genome-wide characterization of LTR retrotransposons in the non-model deep-sea annelid Lamellibrachia luymesi.
   BMC Genomics. 2021 Jun 23;22(1):466. doi: 10.1186/s12864-021-07749-1.
3. Evolutionary history of Oryza sativa LTR retrotransposons: a preliminary survey of the rice genome sequences.
   BMC Genomics. 2004 Mar 2;5(1):18. doi: 10.1186/1471-2164-5-18.
4. De novo identification of LTR retrotransposons in eukaryotic genomes.
   BMC Genomics. 2007 Apr 3;8:90. doi: 10.1186/1471-2164-8-90.
5. Automatic curation of LTR retrotransposon libraries from plant genomes through machine learning.
   J Integr Bioinform. 2022 Jul 12;19(3). doi: 10.1515/jib-2021-0036. eCollection 2022 Sep 1.
6. Mollusc genomes reveal variability in patterns of LTR-retrotransposons dynamics.
   BMC Genomics. 2018 Nov 15;19(1):821. doi: 10.1186/s12864-018-5200-1.
7. The landscape and structural diversity of LTR retrotransposons in Musa genome.
   Mol Genet Genomics. 2017 Oct;292(5):1051-1067. doi: 10.1007/s00438-017-1333-1. Epub 2017 Jun 10.
8. Long Terminal Repeat Retrotransposon Content in Eight Diploid Sunflower Species Inferred from Next-Generation Sequence Data.
   G3 (Bethesda). 2016 Aug 9;6(8):2299-308. doi: 10.1534/g3.116.029082.
9. Identification and characterization of genome-wide long terminal repeat retrotransposons provide an insight into elucidating the trait evolution of five Rhododendron species.
   Plant Biol (Stuttg). 2023 Aug;25(5):813-828. doi: 10.1111/plb.13532. Epub 2023 May 15.
10. InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning.
   Genes (Basel). 2021 Jan 28;12(2):190. doi: 10.3390/genes12020190.

Cited by

1. DANTE and DANTE_LTR: lineage-centric annotation pipelines for long terminal repeat retrotransposons in plant genomes.
   NAR Genom Bioinform. 2024 Aug 29;6(3):lqae113. doi: 10.1093/nargab/lqae113. eCollection 2024 Sep.
2. De novo genome assembly of white clover (Trifolium repens L.) reveals the role of copy number variation in rapid environmental adaptation.
   BMC Biol. 2024 Aug 7;22(1):165. doi: 10.1186/s12915-024-01962-6.
3. Look4LTRs: a Long terminal repeat retrotransposon detection tool capable of cross species studies and discovering recently nested repeats.
   Mob DNA. 2024 Apr 16;15(1):8. doi: 10.1186/s13100-024-00317-w.
4. The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars.
   Nat Genet. 2024 Apr;56(4):721-731. doi: 10.1038/s41588-024-01695-w. Epub 2024 Apr 15.
5. From tradition to innovation: conventional and deep learning frameworks in genome annotation.
   Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae138.
6. MegaLTR: a web server and standalone pipeline for detecting and annotating LTR-retrotransposons in plant genomes.
   Front Plant Sci. 2023 Sep 20;14:1237426. doi: 10.3389/fpls.2023.1237426. eCollection 2023.
7. Genomic object detection: An improved approach for transposable elements detection and classification using convolutional neural networks.
   PLoS One. 2023 Sep 21;18(9):e0291925. doi: 10.1371/journal.pone.0291925. eCollection 2023.
8. Efficient homology-based annotation of transposable elements using minimizers.
   Appl Plant Sci. 2023 May 11;11(4):e11520. doi: 10.1002/aps3.11520. eCollection 2023 Jul-Aug.
9. Selection signatures and population dynamics of transposable elements in lima bean.
   Commun Biol. 2023 Aug 2;6(1):803. doi: 10.1038/s42003-023-05144-y.

References cited in this article

1. Software evaluation for de novo detection of transposons.
   Mob DNA. 2022 Apr 27;13(1):14. doi: 10.1186/s13100-022-00266-2.
2. Specificities and Dynamics of Transposable Elements in Land Plants.
   Biology (Basel). 2022 Mar 23;11(4):488. doi: 10.3390/biology11040488.
3. TransposonUltimate: software for transposon classification, annotation and detection.
   Nucleic Acids Res. 2022 Jun 24;50(11):e64. doi: 10.1093/nar/gkac136.
4. A guide to machine learning for biologists.
   Nat Rev Mol Cell Biol. 2022 Jan;23(1):40-55. doi: 10.1038/s41580-021-00407-0. Epub 2021 Sep 13.
5. A comprehensive annotation dataset of intact LTR retrotransposons of 300 plant genomes.
   Sci Data. 2021 Jul 15;8(1):174. doi: 10.1038/s41597-021-00968-x.
6. K-mer-based machine learning method to classify LTR-retrotransposons in plant genomes.
   PeerJ. 2021 May 19;9:e11456. doi: 10.7717/peerj.11456. eCollection 2021.
7. TERL: classification of transposable elements by convolutional neural networks.
   Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa185.
8. The absence of the caffeine synthase gene is involved in the naturally decaffeinated status of Coffea humblotiana, a wild species from Comoro archipelago.
   Sci Rep. 2021 Apr 14;11(1):8119. doi: 10.1038/s41598-021-87419-0.
9. ClassifyTE: a stacking-based prediction of hierarchical classification of transposable elements.
   Bioinformatics. 2021 Sep 9;37(17):2529-2536. doi: 10.1093/bioinformatics/btab146.
10. InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning.
   Genes (Basel). 2021 Jan 28;12(2):190. doi: 10.3390/genes12020190.