Genomic object detection: An improved approach for transposable elements detection and classification using convolutional neural networks.

Affiliations

Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Colombia.

Center for Technology Development Bioprocess and Agroindustry Plant, Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia.

Publication information

PLoS One. 2023 Sep 21;18(9):e0291925. doi: 10.1371/journal.pone.0291925. eCollection 2023.

DOI:10.1371/journal.pone.0291925
PMID:37733731
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10513252/
Abstract

Analysis of eukaryotic genomes requires the detection and classification of transposable elements (TEs), a crucial but complex and time-consuming task. To improve the performance of tools that accomplish these tasks, Machine Learning approaches (ML) that leverage computer resources, such as GPUs (Graphical Processing Unit) and multiple CPU (Central Processing Unit) cores, have been adopted. However, until now, the use of ML techniques has mostly been limited to classification of TEs. Herein, a detection-classification strategy (named YORO) based on convolutional neural networks is adapted from computer vision (YOLO) to genomics. This approach enables the detection of genomic objects through the prediction of the position, length, and classification in large DNA sequences such as fully sequenced genomes. As a proof of concept, the internal protein-coding domains of LTR-retrotransposons are used to train the proposed neural network. Precision, recall, accuracy, F1-score, execution times and time ratios, as well as several graphical representations were used as metrics to measure performance. These promising results open the door for a new generation of Deep Learning tools for genomics. YORO architecture is available at https://github.com/simonorozcoarias/YORO.
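
The abstract lists precision, recall, accuracy, and F1-score among the performance metrics for a detector that predicts the position, length, and class of genomic objects. The following is a minimal sketch of how such metrics could be computed at the nucleotide level from predicted and reference intervals. It is an illustration under stated assumptions, not the evaluation protocol of the paper: the `Detection` type, the function names, the per-nucleotide comparison, and the example label "RT" are all hypothetical, and class labels are ignored when scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detected genomic object."""
    start: int   # 0-based start position in the sequence
    length: int  # number of nucleotides covered
    label: str   # class name (not used by the metrics below)


def to_mask(detections, seq_len):
    """Mark every nucleotide covered by at least one detection."""
    mask = [False] * seq_len
    for d in detections:
        lo = max(d.start, 0)
        hi = min(d.start + d.length, seq_len)
        for i in range(lo, hi):
            mask[i] = True
    return mask


def nucleotide_metrics(truth, predicted, seq_len):
    """Compare reference and predicted intervals position by position.

    Assumption: a nucleotide counts as positive if any interval covers it.
    """
    t = to_mask(truth, seq_len)
    p = to_mask(predicted, seq_len)
    tp = sum(1 for a, b in zip(t, p) if a and b)
    fp = sum(1 for a, b in zip(t, p) if not a and b)
    fn = sum(1 for a, b in zip(t, p) if a and not b)
    tn = seq_len - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / seq_len if seq_len else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Toy example: one reference domain and one prediction shifted by 5 nt.
truth = [Detection(start=10, length=20, label="RT")]
predicted = [Detection(start=15, length=20, label="RT")]
metrics = nucleotide_metrics(truth, predicted, seq_len=100)
print(metrics)
```

In this toy case the prediction overlaps the reference on 15 of 20 nucleotides, so precision, recall, and F1 are each 0.75, and accuracy is 0.9 because the 75 uncovered positions count as true negatives. A class-aware variant would build one mask per label and average the per-class scores.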

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/f0dbc29f1fb8/pone.0291925.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/5262b673ddaf/pone.0291925.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/8c23eb914822/pone.0291925.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/9f86428ef94c/pone.0291925.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/a32f68a106f8/pone.0291925.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/60eb1b182fa1/pone.0291925.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/402b/10513252/a92b0c5064c2/pone.0291925.g007.jpg
