Improving tRNAscan-SE Annotation Results via Ensemble Classifiers.

Authors

Zou Quan, Guo Jiasheng, Ju Ying, Wu Meihong, Zeng Xiangxiang, Hong Zhiling

Affiliations

School of Information Science and Technology, Xiamen University, Xiamen 361005, China.

School of Computer Science and Technology, Tianjin University, Tianjin 300072, China.

Publication information

Mol Inform. 2015 Nov;34(11-12):761-70. doi: 10.1002/minf.201500031. Epub 2015 Sep 14.

DOI:10.1002/minf.201500031

PMID:27491037

Abstract

tRNAscan-SE is a tRNA detection program that is widely used for tRNA annotation; however, its false positive rate is unacceptable for large sequences. Here, we used a machine learning method to improve the tRNAscan-SE results and designed a new predictor, tRNA-Predict. We obtained real and pseudo-tRNA sequences as training data sets using tRNAscan-SE and constructed three different tRNA feature sets. We then set up an ensemble classifier, LibMutil, to predict tRNAs from the training data. The positive data set of 623 tRNA sequences was obtained from tRNAdb 2009, and the negative data set consisted of the false positive tRNAs predicted by tRNAscan-SE. Our in silico experiments showed a prediction accuracy of 95.1% for tRNA-Predict under 10-fold cross-validation. tRNA-Predict was developed to distinguish functional tRNAs from pseudo-tRNAs rather than to predict tRNAs from a genome-wide scan. However, tRNA-Predict can work with the output of tRNAscan-SE, which is a genome-wide scanning method, to improve the tRNAscan-SE annotation results. The tRNA-Predict web server is accessible at http://datamining.xmu.edu.cn/~gjs/tRNA-Predict.
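The evaluation protocol named in the abstract (an ensemble classifier scored by 10-fold cross-validation) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration: the 2-mer frequency features are not one of the paper's three feature sets, the three toy base classifiers combined by majority vote are not LibMutil, and the sequences are synthetic rather than real or pseudo-tRNAs.

```python
import random
from collections import Counter
from itertools import product

# Hypothetical feature set: normalized 2-mer frequencies over the RNA alphabet.
KMERS = ["".join(p) for p in product("ACGU", repeat=2)]

def kmer_features(seq, k=2):
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    return [counts[m] / total for m in KMERS]

def sq_euclid(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

def make_centroid_clf(dist):
    """Return a fit function for a nearest-centroid classifier using `dist`."""
    def fit(X, y):
        cents = {}
        for label in set(y):
            rows = [x for x, l in zip(X, y) if l == label]
            cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return lambda x: min(cents, key=lambda l: dist(x, cents[l]))
    return fit

def fit_1nn(X, y):
    """1-nearest-neighbour classifier (squared Euclidean distance)."""
    return lambda x: min(zip(X, y), key=lambda p: sq_euclid(x, p[0]))[1]

# Illustrative base learners; three voters and two classes means no tied votes.
FITTERS = [make_centroid_clf(sq_euclid), make_centroid_clf(manhattan), fit_1nn]

def fit_ensemble(X, y, fitters):
    models = [f(X, y) for f in fitters]
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]

def cross_val_accuracy(X, y, fitters, k=10, seed=0):
    """Shuffle once, split into k folds, train on k-1 folds, test on the rest."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_ensemble([X[i] for i in train], [y[i] for i in train], fitters)
        correct += sum(model(X[i]) == y[i] for i in fold)
    return correct / len(X)

def toy_sequences(n, weights, rng, length=72):
    # Synthetic stand-ins only: composition-biased random strings.
    return ["".join(rng.choices("ACGU", weights=weights, k=length)) for _ in range(n)]

rng = random.Random(1)
pos = toy_sequences(40, [1, 3, 3, 1], rng)   # GC-rich "positive" class
neg = toy_sequences(40, [3, 1, 1, 3], rng)   # AU-rich "negative" class
X = [kmer_features(s) for s in pos + neg]
y = [1] * len(pos) + [0] * len(neg)
acc = cross_val_accuracy(X, y, FITTERS, k=10)
print(f"10-fold CV accuracy on toy data: {acc:.3f}")
```

Each sequence is held out exactly once, so the reported accuracy is the fraction of held-out predictions that match the labels. The toy classes are trivially separable by base composition, so the number printed here says nothing about the 95.1% reported for tRNA-Predict.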

Similar articles

tRNA-DL: A Deep Learning Approach to Improve tRNAscan-SE Prediction Results.

Hum Hered. 2018;83(3):163-172. doi: 10.1159/000493215. Epub 2019 Jan 25.

tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes.

Nucleic Acids Res. 2016 Jul 8;44(W1):W54-7. doi: 10.1093/nar/gkw413. Epub 2016 May 12.

tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes.

Nucleic Acids Res. 2021 Sep 20;49(16):9077-9096. doi: 10.1093/nar/gkab688.

tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

Nucleic Acids Res. 1997 Mar 1;25(5):955-64. doi: 10.1093/nar/25.5.955.

The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs.

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W686-9. doi: 10.1093/nar/gki366.

tRNADB-CE: tRNA gene database curated manually by experts.

Nucleic Acids Res. 2009 Jan;37(Database issue):D163-8. doi: 10.1093/nar/gkn692. Epub 2008 Oct 8.

On distinguishing between canonical tRNA genes and tRNA gene fragments in prokaryotes.

RNA Biol. 2023 Jan;20(1):48-58. doi: 10.1080/15476286.2023.2172370.

SPLITS: a new program for predicting split and intron-containing tRNA genes at the genome level.

In Silico Biol. 2006;6(5):411-8.

TFAM 1.0: an online tRNA function classifier.

Nucleic Acids Res. 2007 Jul;35(Web Server issue):W350-3. doi: 10.1093/nar/gkm393. Epub 2007 Jun 25.

Cited by

The complete mitochondrial genome of indicates structural dynamics and sequence divergences in Poaceae family.

Front Plant Sci. 2025 May 30;16:1589847. doi: 10.3389/fpls.2025.1589847. eCollection 2025.

Analysis of the mitochondrial genome of Neuroctenus hainanensis and the phylogenetic position of Aradoidea.

Sci Rep. 2025 Mar 20;15(1):9632. doi: 10.1038/s41598-025-93375-w.

Biological Sequence Classification: A Review on Data and General Methods.

Research (Wash D C). 2022 Dec 19;2022:0011. doi: 10.34133/research.0011. eCollection 2022.

Insights into structure, codon usage, repeats, and RNA editing of the complete mitochondrial genome of Perilla frutescens (Lamiaceae).

Sci Rep. 2024 Jun 17;14(1):13940. doi: 10.1038/s41598-024-64509-3.

Structural variant landscapes reveal convergent signatures of evolution in sheep and goats.

Genome Biol. 2024 Jun 6;25(1):148. doi: 10.1186/s13059-024-03288-6.

Databases and computational methods for the identification of piRNA-related molecules: A survey.

Comput Struct Biotechnol J. 2024 Jan 22;23:813-833. doi: 10.1016/j.csbj.2024.01.011. eCollection 2024 Dec.

Non-Redundant tRNA Reference Sequences for Deep Sequencing Analysis of tRNA Abundance and Epitranscriptomic RNA Modifications.

Genes (Basel). 2021 Jan 10;12(1):81. doi: 10.3390/genes12010081.

DeepVF: a deep learning-based hybrid framework for identifying virulence factors using the stacking strategy.

Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa125.

sgRNA-PSM: Predict sgRNAs On-Target Activity Based on Position-Specific Mismatch.

Mol Ther Nucleic Acids. 2020 Jun 5;20:323-330. doi: 10.1016/j.omtn.2020.01.029. Epub 2020 Jan 31.

A Linear Regression Predictor for Identifying N-Methyladenosine Sites Using Frequent Gapped K-mer Pattern.

Mol Ther Nucleic Acids. 2019 Dec 6;18:673-680. doi: 10.1016/j.omtn.2019.10.001. Epub 2019 Oct 10.
