Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes.

Affiliations

Department of Computer Science, Universidad Autónoma de Manizales, 170001, Caldas, Colombia.

Department of Systems and Informatics, Center for Technology Development - Bioprocess and Agro-industry Plant, Universidad de Caldas, 170004, Caldas, Colombia.

Publication information

Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac511.

Abstract

LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Characterizing them is essential to understanding their dynamics. However, identifying and classifying these elements remains a challenge. Moreover, current software can be relatively slow (from hours to days), sometimes involves a lot of manual work and does not reach satisfactory levels of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. The tool takes advantage of multi-core and GPU architectures to decrease execution times. On the rice genome, Inpactor2 ran in 5 minutes (faster than the other tools) and had the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity, surpassed only by EDTA, while achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
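
The abstract compares tools by accuracy, F1-Score, specificity and sensitivity. As a reminder of how these metrics relate to one another, here is a minimal sketch using their standard confusion-matrix definitions. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not describe how the authors computed their values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive and one
    negative example, and at least one positive prediction).
    """
    sensitivity = tp / (tp + fn)                 # recall: true elements that were found
    specificity = tn / (tn + fp)                 # true negatives correctly rejected
    precision = tp / (tp + fp)                   # predictions that were correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # all correct calls over all calls
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 combines precision and sensitivity, a tool can lead on F1 while ranking lower on specificity, which is the kind of trade-off the comparison with EDTA describes.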

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d354/9851300/6d302b0cd891/bbac511f1.jpg
